Category: Code

Understanding Begins with Expressive Names

In 2018, I joined a large project halfway through its development. The original engineers had moved on, leaving behind convoluted and undocumented code. Working with this kind of code is challenging because you can’t differentiate the plumbing from the business domain. That makes debugging difficult and changes unpredictable, because you don’t know the impact. It’s like trying to edit a book without understanding the words.

Many engineers believe the measure of success is when the code compiles. I believe it’s when another engineer (or you in six months) understands the “why” of your code. The original engineers handicapped future engineers by leaving the code undocumented and by using obtuse names. Names are sometimes the only window into the previous engineer’s thought process.

Donald Knuth famously said:

Programs are meant to be read by humans and only incidentally for computers to execute. – Donald Knuth

Naming

Naming is hard because it requires labeling and defining where and how a piece fits in an application.

Phil Karlton, while at Netscape, observed:

There are only two hard things in Computer Science: cache invalidation and naming things.
— Phil Karlton

We see our code through the lens of words and names we use. Names create a language for the next engineer to comprehend. This language paints a picture of how the author bridged the business domain and the programming language.

Ludwig Wittgenstein, a philosopher in the first half of the 20th century, said:

The limits of my language mean the limits of my world. – Ludwig Wittgenstein

The language of our software is only as descriptive as the names we use. Vague names blur the software’s purpose; descriptive names bring clarity and understanding.

Imagine visiting a country where you don’t speak the language. A simple request such as asking to use the bathroom brings bewildered looks. The inability to communicate is frustrating, maybe even scary. An engineer feels the same when confronted with confusing, unclear, or even worse, misleading names.

This feeling is best experienced.

Experience

Examine the first snippet of code. What does this code do? What’s the “why”?

Take your time.

public class StringHelper
{
    public string Get(string input1, string input2)
    {
        var result = string.Empty;
        if (!string.IsNullOrEmpty(input1) && !string.IsNullOrEmpty(input2))
        {
            result = $"{input1} {input2}";
        }
        return result;
    }
}

The above code is a simple concatenation of two strings. What the code doesn’t tell you is the “why.” The “why” is so important; without it, it’s difficult to change behavior because you can’t gauge the impact. Of course, investigating the code’s usage will likely reveal its “why,” but that’s the point. You shouldn’t have to discover the code’s purpose; the author should have left clues. It’s their responsibility to do so.

Let’s revisit the code, but with a little “why” sprinkled in.

Again, take your time and observe the difference you feel when reading this code.

public class FirstAndLastNameFormatter
{
    public string Concatenate(string firstName, string lastName)
    {
        var fullName = string.Empty;
        if (!string.IsNullOrEmpty(firstName) && !string.IsNullOrEmpty(lastName))
        {
            fullName = $"{firstName} {lastName}";
        }
        return fullName;
    }
}

The “why” brings the code to life; there’s a story to read.

Communicate

Communicating the intent and the design to the next engineer allows software to live and to grow because if engineers can’t modify the software, it dies. This is a tragedy, even more so when it’s a result of poor design and lack of expressiveness — each is preventable with knowledge.

Do the next engineer a favor and be expressive in your code. Use descriptive names and capture the “why” because who knows, the next engineer might be you.

Codifying the Secret Sauce

Each application has its secret sauce, its reason for existing. Codifying the secret sauce is instrumental in writing maintainable and successful applications.

Wait. What is codifying? Patience my friend, we’ll get there.

First let’s hypothesize:

You’ve just been promoted to Lead Software Engineer (Congratulations!). Your CEO’s first task is creating a new product for the company. It’s a ground-up accounting application. The executives feel having a custom accounting solution will give them an edge on the competition.

A few months have passed, most of the cross-cutting concerns are developed (yay you!). The team is now focused on the yummy goodness of the application: the business domain (the secret sauce). This is where codifying the secret sauce begins.

Codifying is putting structure around an essential concept in the business domain.

In accounting, the price (P) to earnings (E) ratio (P/E Ratio) measures a company’s share price relative to its earnings per share. A high P/E ratio suggests investors expect high earnings growth in the future. The P/E ratio is calculated by dividing the market value per share (share price) by the earnings per share ((profit – dividends) / number of outstanding shares).

A simple, and I argue, naive implementation:

public class Metric
{
    public string Name { get; set; }
    public decimal Value { get; set; }
    public int Order { get; set; }
}

public class AccountingSummary
{
    public Metric[] GetMetrics(decimal price, decimal earnings)
    {
        var priceEarningsRatio = price / earnings;

        var priceEarningsRatioMetric = new Metric
        {
            Name = "P/E Ratio",
            Value = priceEarningsRatio,
            Order = 0
        };

        return new[] { priceEarningsRatioMetric };
    }
}

If this is only used in one place, it’s fine. But what if you use the P/E ratio in other areas?

Like here in PriceEarnings.cs

var priceEarningsRatio = price/earnings;

And here in AccountSummary.cs

var priceEarningsRatio = price/earnings;

And over here in StockSummary.cs

var priceEarningsRatio = price/earnings;

The P/E Ratio is core to this application, but because it’s hardcoded in various places, its importance is lost in a sea of code. It’s just another tree in the forest.

You also open yourself to the risk of changing the ratio in one place but not the others, which can throw off downstream calculations. These types of errors are notoriously difficult to find.

Often testers will assume if it works in one area, it’s correct in all areas. Why wouldn’t the application use the same code to generate the P/E Ratio for the entire application? Isn’t this the point of Object Oriented Programming?

I can imagine an error like this making it into production and not being discovered until a visit from your executive, who demands to know why the SEC’s P/E Ratio calculations are different from what the company filed. That’s not a great place to be.

Let’s revisit our P/E ratio implementation and see how we can improve upon our first attempt.

In accounting systems, formulas are a thing. Let’s put structure around formulas by adding an interface:

public interface IFormula<TModel>
{
    decimal Calculate(TModel model);
}

Each formula is now implemented with this interface, giving us consistency and predictability.

Here’s our improved P/E Ratio after implementing our interface:

We’ve added a PriceEarningsModel to pass the needed data into our Calculate method.

public class PriceEarningsModel
{
    public decimal Price { get; set; }
    public decimal Earnings { get; set; }
}

Using our PriceEarningsModel, we’ve created an implementation of the IFormula interface for the P/E Ratio.

public class PriceEarningsRatioFormula : IFormula<PriceEarningsModel>
{
    public decimal Calculate(PriceEarningsModel model)
    {
        return model.Price / model.Earnings;
    }
}

We’ve now codified the P/E Ratio. It’s a first-class concept in our application. We can use it anywhere. It’s testable, and a change impacts the entire application.
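
Because the formula now lives in its own small class, testing it is straightforward. Here’s a minimal sketch using xUnit; the test class name and values are mine, not from the original project:

using Xunit;

public class PriceEarningsRatioFormulaTests
{
    [Fact]
    public void Calculate_DividesPriceByEarnings()
    {
        var formula = new PriceEarningsRatioFormula();

        var ratio = formula.Calculate(new PriceEarningsModel { Price = 50m, Earnings = 5m });

        // A $50 share price with $5 earnings per share is a P/E of 10.
        Assert.Equal(10m, ratio);
    }
}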

As a reminder, here’s the implementation we began with:

public class Metric
{
    public string Name { get; set; }
    public decimal Value { get; set; }
    public int Order { get; set; }
}

public class AccountingSummary
{
    public Metric[] GetMetrics(decimal price, decimal earnings)
    {
        var priceEarningsRatio = price / earnings;

        var priceEarningsRatioMetric = new Metric
        {
            Name = "P/E Ratio",
            Value = priceEarningsRatio,
            Order = 0
        };

        return new[] { priceEarningsRatioMetric };
    }
}

It’s simple and gets the job done. The problem is that the P/E Ratio, a core concept in our accounting application, doesn’t stand out. Engineers not familiar with the application or the business domain won’t understand its importance.

Our improved implementation uses our new P/E Ratio class. We inject the PriceEarningsRatioFormula class into our AccountingSummary class.

We replace our hardcoded P/E Ratio with our new `PriceEarningsRatioFormula` class.

public class AccountingSummary
{
    private readonly PriceEarningsRatioFormula _peRatio;

    public AccountingSummary(PriceEarningsRatioFormula peRatio)
    {
        _peRatio = peRatio;
    }

    public Metric[] GetMetrics(decimal price, decimal earnings)
    {
        var priceEarningsRatio = _peRatio.Calculate(new PriceEarningsModel
        {
            Price = price,
            Earnings = earnings
        });

        var priceEarningsRatioMetric = new Metric
        {
            Name = "P/E Ratio",
            Value = priceEarningsRatio,
            Order = 0
        };

        return new[] { priceEarningsRatioMetric };
    }
}

One could argue there is a bit more lifting with the PriceEarningsRatioFormula than with the previous implementation, and I’d agree. There is a bit more ceremony, but the benefits are well worth the small increase in code and ceremony.

First, we gain the ability to change the P/E ratio for the entire application. We also have a single implementation to debug if defects arise.

Lastly, we’ve codified the concept of the PriceEarningsRatioFormula in the application. When a new engineer joins the team, they’ll know formulas are essential to the application and the business domain.
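
If you’re using the built-in dependency injection container, wiring this up might look something like the sketch below; the lifetime is a judgment call, and AddTransient is only one reasonable option:

// Register the formula so it can be injected into AccountingSummary (lifetime is a judgment call).
services.AddTransient<PriceEarningsRatioFormula>();
services.AddTransient<AccountingSummary>();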

There are other methods of codifying (encapsulating) the domain, such as microservices and assemblies. Each approach has its pros and cons; you and your team will have to decide what’s best for your application.

Ensconcing key domain concepts in classes and interfaces creates reusable components and conceptual boundaries, making an application easier to reason about, reducing defects, and lowering the barrier to onboarding new engineers.

Garbage Collection Types in .Net Core

Memory management in modern languages is often an afterthought. For all intents and purposes, we write software with nary a thought about memory. This serves us well, but there are always exceptions…

In California, there are extensive financial reporting requirements for Local Education Agencies (LEAs); an LEA can be a county, a district, a charter, or a single school. Most LEAs create their own financial reports, usually centered around Excel, so it’s no surprise that each report is different. To solve this problem, the California Board of Education commissioned software, Ed-Pro, to generate financial reports.

I was a part of the development team. 

My first stop was the testing logs. Ed-Pro’s logs pointed to high memory usage; perhaps there was a memory leak? An engineer observed that Ed-Pro’s calculations used a large amount of short-lived memory. If that memory wasn’t cleaned up quickly, it could look like a memory leak.

Ed-Pro is built on top of .Net Core, Microsoft’s multi-platform framework. In .Net Core, memory is divided into three generations: short-lived (Gen0), medium-lived (Gen1), and long-lived (Gen2). Gen0 holds short-lived data that quickly goes out of scope. Gen1 holds medium-lived data that hangs around a bit longer but also eventually goes out of scope. Gen2 holds long-lived data that may live for the life of the application. Gen0 memory is reclaimed constantly, Gen1 less frequently than Gen0, and Gen2 even less frequently than Gen1.
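
To make the generations a little more concrete, here’s a small sketch (not from Ed-Pro) that uses the framework’s GC class to observe how objects are promoted and how often each generation is collected:

// Not from Ed-Pro; a small illustration of generations using the built-in GC class.
var data = new byte[1_000];
Console.WriteLine(GC.GetGeneration(data));   // 0 - new objects start in Gen0

GC.Collect();                                // force a collection (diagnostics only, never in production code)
Console.WriteLine(GC.GetGeneration(data));   // survivors are promoted, so this typically reports 1

Console.WriteLine(GC.CollectionCount(0));    // Gen0 collections happen constantly
Console.WriteLine(GC.CollectionCount(2));    // Gen2 collections are far less frequent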

The only sure way to understand Ed-Pro’s memory usage was to profile it. Below is a screenshot using dotMemory by JetBrains.

As suspected, we found large amounts of Gen0 memory (the blue), so much so that it appeared Garbage Collection couldn’t keep up. To compensate, Garbage Collection oscillated between increasing the memory space (adding more memory for the application’s use) and cleaning it up. During the cleanup cycles, the application was unresponsive.

At first, we were stumped. Isn’t the purpose of the GC to keep memory tidy? Two articles were instrumental in our understanding of how Garbage Collection works in .Net: Mark Vincze’s article Troubleshooting high memory usage with ASP.Net Core on Kubernetes and Fundamentals of Garbage Collection by Microsoft. Both are great reads and brought clarity to the memory usage in Ed-Pro.

Here’s a summary of what we learned. There are two types of Garbage Collection in .Net: Server Garbage Collection and Workstation Garbage Collection.

Server Garbage Collection makes a couple of assumptions: first, that ample memory is available, and second, that the processors are multi-core and fast. Both can be true, but we live in a world of virtual machines and Docker where it’s more likely that both assumptions are false.

Server Garbage Collection allows memory to build up; at some point, it does one of two things: it either increases the memory space, allowing memory to grow, or it frees up orphaned memory. When it chooses to free memory, Garbage Collection starts the process on a thread with higher priority than the application’s threads. If the machine is fast, the cleanup shouldn’t be noticed; if it’s not, the application halts until the cleanup is completed.

Workstation Garbage Collection operates differently. It continuously reclaims memory on a thread with the same priority as the application, which means it competes with the application for resources and can cause slowness. The upside is that the application’s memory usage can stay quite low, particularly when it uses large amounts of Gen0.

By default, if .Net Core detects a server, it runs the Server Garbage Collection type, which was the case with our application. To run the Workstation Garbage Collection type, add the following snippet to your project file:

  <PropertyGroup> 
    <ServerGarbageCollection>false</ServerGarbageCollection>
  </PropertyGroup>

We made this configuration change to Ed-Pro. Using dotMemory, we profiled Ed-Pro’s memory with Workstation Garbage Collection enabled and loaded the same screens as in the previous test. Here are the results:

The memory usage is significantly decreased, and the Gen0 allocations are virtually non-existent. Beyond the differences in the graph, memory usage with Server Garbage Collection topped 1 GB, while Workstation Garbage Collection topped out at roughly 200 MB.

Every application is different. Our application used a ton of temporary data and thus a ton of Gen0 memory. Your application may lean on longer-lived memory such as Gen1 or Gen2, in which case Server Garbage Collection makes a whole lot of sense. My advice is to profile your memory under different conditions to get an idea of how memory is used, and then decide which mode is best for your application.

You Are Not Your Code

It’s not personal.

Your code reflects neither your beliefs, nor your upbringing, nor your character.

Your thoughts and your opinions evolve, new ideas form, and you change. 

The you of today will be different from the you of tomorrow.

Embrace the difference; you and your code are better because of it.

The Collection Comparer, Finding the Differences Between Two Collections

Have you ever had to compare two collections and execute some logic based on whether an item is in the source collection, in the comparing collection, or in both? Yeah, me too. I needed to merge data from the UI and the database, and I couldn’t find a good solution, so I wrote a collection comparer.

To illustrate how this works, let’s look at an example.

In the source data we have the values 1, 3, 4, and 6, and in the comparing collection we have the values 1, 2, 3, 4, and 5.

The source data is missing the 2 and the 5 when compared to the comparing collection, and the comparing collection is missing the 6 when compared to the source collection.

Let’s walk through this merge:

  1. in both (update)
  2. only in the comparing collection (add to source)
  3. in both (update)
  4. in both (update)
  5. only in the comparing collection (add to source)
  6. only in the source collection (remove from source)

Here’s what the code looks like:

var source = new []{1, 3, 4, 6};
var collection = new[] {1, 2, 3, 4, 5};

source.CompareTo(collection, (s, d) => s == d)
    .OnlyInSourceCollection(s=> {/* do something */})
    .OnlyInComparingCollection(s=>{/* do something */})
    .InBoth(s=> {/*do something*/})
    .Process();

Why not use LINQ?

You can use LINQ; however, LINQ will iterate the collections at least three times, and that doesn’t include operating on the data (adding, updating, and deleting). Using the CollectionComparer, the data is only iterated twice.
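
For comparison, a LINQ-based approach might look like the sketch below; each of Except and Intersect walks the data, which is where the extra iterations come from:

// A LINQ sketch of the same comparison (requires System.Linq); each operator iterates the data.
var source = new[] { 1, 3, 4, 6 };
var comparing = new[] { 1, 2, 3, 4, 5 };

var onlyInSource = source.Except(comparing);      // 6    -> remove from source
var onlyInComparing = comparing.Except(source);   // 2, 5 -> add to source
var inBoth = source.Intersect(comparing);         // 1, 3, 4 -> update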

There are faster ways to find the differences, such as a binary search, but a binary search requires sorted, comparable data. The collection comparer supports any type of comparison; the comparison is defined with this code: (s, d) => s == d.

The source code is found on GitHub.

Implementing Request Caching in ASP.Net Core

At some point in an application’s development, usually fairly early on, you realize the application is slow. After some research, you find the culprit: unnecessarily retrieving the same data. A light goes off, and you think: “I need some caching.”

Caching is an invaluable pattern for eliminating redundant calls to a database or a third-party API. Microsoft provides IMemoryCache for time-based caching; however, sometimes time-based caching isn’t what you need. In this article, we look at Request Scoped caching and how it can benefit us.

What is Request caching? Request caching is a mechanism to cache data for the life of a web request. In .Net, we’ve had this ability in some capacity with the HttpContext.Items collection; however, HttpContext is not known for its injectability.
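
For context, per-request caching with HttpContext.Items looks roughly like the sketch below; the class name is made up, and IHttpContextAccessor comes from Microsoft.AspNetCore.Http. It works, but every consumer ends up coupled to the HttpContext:

// A rough sketch of per-request caching via HttpContext.Items (class name is hypothetical).
public class HttpContextItemsCache
{
    private readonly IHttpContextAccessor _accessor;

    public HttpContextItemsCache(IHttpContextAccessor accessor)
    {
        _accessor = accessor;
    }

    public void Add(string key, object value)
    {
        _accessor.HttpContext.Items[key] = value;
    }

    public object Retrieve(string key)
    {
        return _accessor.HttpContext.Items.TryGetValue(key, out var value) ? value : null;
    }
}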

Request Scoped caching has a few benefits. First, it eliminates the concern of stale data: in most scenarios, a request executes in less than a second, which typically isn’t long enough for data to become stale. Second, expiration isn’t a concern because the data dies when the request ends.

Out of the box, ASP.Net Core doesn’t have injectable request-scoped caching. As mentioned earlier, HttpContext.Items is an option, but it’s not an elegant solution.

Luckily for us, ASP.Net Core gives us the tools to create an injectable Request Caching implementation by using the built-in dependency injection (DI) framework.

The built-in DI framework has three lifetimes for dependencies: Singleton, Scoped, and Transient. Singleton lives for the life of the application, Scoped lives for the life of the request, and Transient creates a new instance each time the dependency is resolved.
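
As a quick illustration, registrations for the three lifetimes look like this; IClock, SystemClock, IEmailBuilder, and EmailBuilder are hypothetical types used only to show the shape:

// Hypothetical registrations illustrating the three lifetimes.
services.AddSingleton<IClock, SystemClock>();         // one instance for the life of the application
services.AddScoped<IRequestCache, RequestCache>();    // one instance per web request
services.AddTransient<IEmailBuilder, EmailBuilder>(); // a new instance every time it's resolved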

I’ve created an interface modeled after the IMemoryCache interface to keep things consistent.

Interface

public interface IRequestCache
{
    /// <summary>
    /// Add the value into request cache. If the key already exists, the value is overwritten.
    /// </summary>
    /// <param name="key"></param>
    /// <param name="value"></param>
    /// <typeparam name="TValue"></typeparam>
    void Add<TValue>(string key, TValue value);

    /// <summary>
    /// Remove the key from the request cache
    /// </summary>
    /// <param name="key"></param>
    void Remove(string key);

    /// <summary>
    /// Retrieve the value by key, if the key is not in the cache then the add func is called
    /// adding the value to cache and returning the added value.
    /// </summary>
    /// <param name="key"></param>
    /// <param name="add"></param>
    /// <typeparam name="TValue"></typeparam>
    /// <returns></returns>
    TValue RetrieveOrAdd<TValue>(string key, Func<TValue> add);

    /// <summary>
    /// Retrieves the value by key. When the key does not exist the default value for the type is returned.
    /// </summary>
    /// <param name="key"></param>
    /// <typeparam name="TValue"></typeparam>
    /// <returns></returns>
    TValue Retrieve<TValue>(string key);
}

Implementation

public class RequestCache : IRequestCache
{
    private readonly IDictionary<string, object> _cache = new Dictionary<string, object>();

    /// <summary>
    /// Add the value into request cache. If the key already exists, the value is overwritten.
    /// </summary>
    /// <param name="key"></param>
    /// <param name="value"></param>
    /// <typeparam name="TValue"></typeparam>
    public void Add<TValue>(string key, TValue value)
    {
        _cache[key] = value;
    }

    /// <summary>
    /// Remove the key from the request cache
    /// </summary>
    /// <param name="key"></param>
    public void Remove(string key)
    {
        if (_cache.ContainsKey(key))
        {
            _cache.Remove(key);
        }
    }

    /// <summary>
    /// Retrieve the value by key, if the key is not in the cache then the add func is called
    /// adding the value to cache and returning the added value.
    /// </summary>
    /// <param name="key"></param>
    /// <param name="add"></param>
    /// <typeparam name="TValue"></typeparam>
    /// <returns></returns>
    public TValue RetrieveOrAdd<TValue>(string key, Func<TValue> add)
    {
        if (_cache.ContainsKey(key))
        {
            return (TValue)_cache[key];
        }

        var value = add();

        _cache[key] = value;

        return value;
    }

    /// <summary>
    /// Retrieves the value by key. When the key does not exist the default value for the type is returned.
    /// </summary>
    /// <param name="key"></param>
    /// <typeparam name="TValue"></typeparam>
    /// <returns></returns>
    public TValue Retrieve<TValue>(string key)
    {
        if (_cache.ContainsKey(key))
        {
            return (TValue)_cache[key];
        }

        return default(TValue);
    }
}

Using ASP.Net Core’s DI framework, we’ll wire it up as Scoped.

services.AddScoped<IRequestCache, RequestCache>();

Usage

public class UserService
{
    private readonly IRequestCache _cache;
    private readonly IUserRepository _userRepository;

    public UserService(IRequestCache cache, IUserRepository userRepository)
    {
        _cache = cache;
        _userRepository = userRepository;
    }

    public User RetrieveUserById(int userId)
    {
        var cacheKey = BuildCacheKey(userId);

        return _cache.RetrieveOrAdd(cacheKey, () => _userRepository.RetrieveUserBy(userId));
    }

    public void Delete(int userId)
    {
        var cacheKey = BuildCacheKey(userId);

        _userRepository.Delete(userId);
        _cache.Remove(cacheKey);
    }

    private static string BuildCacheKey(int userId)
    {
        return $"user_{userId}";
    }
}

That’s it! Request Caching is now injectable in any place you need it.

Visit the Git Repository and feel free to take the code for a spin.

Workaround for ‘Template parse errors’ in Angular

This was one of the more frustrating issues I ran into with Angular 2/4+. It’s not an issue with Angular per se, but with how webpack bundles the supporting HTML files.

This issue happens when you run webpack with the production flag (webpack -p). The compilation completes, but running the site produces a runtime error: Template parse errors.

Uncaught Error: Template parse errors:
Unexpected closing tag "a". It may happen when the tag has already been closed by another tag. For more info see https://www.w3.org/TR/html5/syntax.html#closing-elements-that-have-implied-end-tags (" Home

The suspected HTML:


I couldn’t find anything wrong with this HTML. I ran it through multiple online HTML validators, and the HTML always came back as valid HTML. After a few weeks of beating my head against the wall and getting nowhere, I figured there must be a way around this issue.

What I discovered via a GitHub thread (sorry, I lost the link to it) was that it’s the HTML loader trying to minimize the HTML that causes the problem. If HTML minimization is turned off, the issue goes away. To be fair, most folks with this error had invalid HTML. I encourage you to check your HTML before working around this issue; you may just be pushing the problem down the road.

Turning off HTML minimization is as easy as updating your html-loader configuration in webpack.config.js.

Before:
{
    test: /\.html$/,
    loader: 'html-loader'
},

After:
{
    test: /\.html$/,
    use: [{
        loader: 'html-loader',
        options: {
            minimize: false,
            removeComments: false,
            collapseWhitespace: false,
            removeAttributeQuotes: false,
        }
    }],
},

In a Single Page Application, Should I process on the Client or the Server?

One of the selling points of the Single Page Application (SPA) was offloading work traditionally performed on the server onto the client. I feel the SPA has delivered on this promise.

However, it’s not all roses. It’s easy to get overzealous and push too much work to the client. We often forget that we can’t control the client’s environment — the client can be anything from a ten-year-old machine to the latest-and-greatest smartphone with access to varying internet speeds. The only thing we can count on is the user viewing our site in a browser.

Processing

In my experience, intensive processing and data that must stay consistent between the server and the client, such as date conversions or precise calculations involving money, are prime candidates for server-side processing.

Paging

A common task performed on the client is paging. With small datasets this works great; however, small datasets never stay small. As the data grows, the application slows and eventually becomes unresponsive. Unfortunately, you don’t know it’s happening because it’s on the client, and even worse, it doesn’t occur on all clients, making it hard to troubleshoot.

Moving paging from the client to the server alleviates paging-related performance issues on the client. You might be thinking, “But now I’m making a ton of API calls. Making so many API calls doesn’t seem optimal.” True, it does seem that way, but you’ll be surprised how fast your data returns from the server. And the best part? You control the server and can increase capacity as needed.
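
A server-side paged endpoint can be as small as the sketch below; the route, the Invoice model, and the _invoiceRepository are hypothetical, and the repository is assumed to expose an IQueryable:

// A hypothetical paged endpoint; Invoice and _invoiceRepository are illustrative only.
[HttpGet("api/invoices")]
public ActionResult<IEnumerable<Invoice>> GetInvoices(int page = 1, int pageSize = 25)
{
    var invoices = _invoiceRepository.Query()    // assumed to return IQueryable<Invoice>
        .OrderBy(i => i.InvoiceDate)             // a stable order keeps pages consistent
        .Skip((page - 1) * pageSize)             // skip the previous pages
        .Take(pageSize)                          // return a single page
        .ToList();

    return Ok(invoices);
}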

At the end of the day, you want to provide the user a rich responsive experience and sometimes that’s letting the server do the heavy lifting.

Summary

To summarize: when possible we want to offload work to the client, but in doing so, we can quickly overburden it. Keeping arduous tasks such as paging and intensive processing on the server protects us from overwhelming the client.