3 Reasons Why Code Reviews are Important

A great code review will challenge your assumptions and give you constructive feedback. For me, code reviews are an essential part of growing as a software engineer.

Writing code is an intimate process. Software engineers spend years learning the craft, and when something critical is said of our creation it’s hard not to take it personally. I find myself, at times, getting defensive when I hear criticism. I know the reviewer means well, but this isn’t always comforting. If it wasn’t for honest feedback from some exceptional software engineers, I wouldn’t be half the software engineer I am today.

Benefits of Code Reviews

1. Finding Bugs

Sometimes simply reading the code is enough for you to find an error. Sometimes it’s the other developer who spots it. Either way, walking through the code exposes potential issues.

I think of my mistakes as the grindstone to my sword. To quote Michael Jordan:

I’ve missed more than 9000 shots in my career. I’ve lost almost 300 games. 26 times, I’ve been trusted to take the game winning shot and missed. I’ve failed over and over and over again in my life. And that is why I succeed.

2. Knowledge Transfer

Sharing your work with others is humbling. In many ways you are the code. I know that I feel vulnerable when I share my code.

This is a great opportunity to learn from and to teach other engineers. In sharing your code you are taking the reviewers on a journey, a journey into the code and into aspects of you. A lot can be learned about you from how you write code.

At the end of the code review the reviewers should have a good understanding of how the code works, the rationale behind it and will have learned a little bit about you.

3. Improving the Health of the Code

As I mentioned, the more times you read the code the better the code becomes. The more reviewers there are, the better the chance one of them will suggest an improvement. Some might think skill level matters. It doesn’t. Less experienced software engineers don’t have the deep technical knowledge of experienced software engineers, but they also don’t have to wade through all the mental technical baggage to see opportunities for improvement.

Code reviews give us the benefit of evaluating our code. There will always be something to change to make it just a little bit better.

Coding, in this way, is much like writing. For a good piece to come into focus the code must rest and be re-read. The more times you repeat this process the better the code will become.

In Closing

Some companies don’t officially do code reviews. That’s ok. Seek out other engineers. Most software engineers will be happy to take 10 to 15 minutes to look over your code.

Algorithms: Binary Search

You are presented with a set of 1000 numbers. You are tasked with finding the position of 73. The most obvious approach is to start with the first number and evaluate every number until 73 is found. This approach is called a linear search algorithm or sequential search algorithm. This works for a set of 1000 numbers, but consider if the set is increased to 10 million numbers. A linear search does not scale and is simply not suited for this many numbers, but a binary search algorithm is.
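For comparison, here is a minimal sketch of the linear approach. The class and method names are my own and only for illustration.

```csharp
public static class LinearSearchSketch
{
    // Walks the collection from the first element to the last.
    // Returns the zero-based position of number, or -1 when it is absent.
    public static int LinearSearch(int number, int[] collection)
    {
        for (int i = 0; i < collection.Length; i++)
        {
            if (collection[i] == number)
            {
                return i;
            }
        }

        return -1;
    }
}
```

Unlike a binary search, this works on unsorted data, but in the worst case it has to look at every number.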

A binary search algorithm requires the data to be sorted. Once sorted, the value is found by finding the middle value and comparing it to the search value. If the search value is lower than the middle value, we take the first half of the numbers, find its middle value and compare that to the search value. If the search value is higher, we take the second half instead. Either way, each comparison discards half of the remaining numbers. This process repeats itself until we find the search value or we run out of values.

Comparing the two algorithms for performance: A linear search of 10 million numbers, assuming 1 second per number, will consume roughly 116 days. A binary search of 10 million numbers, again assuming 1 second per number, will only consume about 23 seconds. When searching for numbers the binary search wins hands down.
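The arithmetic behind those figures can be checked with a small sketch. The helper names below are hypothetical, and the worst case is assumed for both algorithms.

```csharp
public static class SearchCost
{
    // Worst case for a linear search: every element is examined once.
    public static long LinearWorstCase(long count)
    {
        return count;
    }

    // Worst case for a binary search: the number of times the range
    // can be halved before nothing is left to examine.
    public static int BinaryWorstCase(long count)
    {
        int steps = 0;

        while (count > 0)
        {
            count /= 2;
            steps++;
        }

        return steps;
    }

    // Converts a number of seconds into days (86,400 seconds per day).
    public static double SecondsToDays(long seconds)
    {
        return seconds / 86400.0;
    }
}
```

For 10 million numbers, SecondsToDays(LinearWorstCase(10000000)) comes to just under 116 days, while BinaryWorstCase(10000000) is 24 comparisons, which is in line with the rough figure above.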

Binary Search implemented in C#:


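        // Returns the zero-based position of number in collection, or -1 if it is not found.
        // The collection must already be sorted in ascending order.
        public int BinarySearch(int number, int[] collection)
        {
            int low = 0;
            int high = collection.Length - 1; // index of the last element

            while (low <= high)
            {
                // Written this way so the sum cannot overflow on very large arrays.
                int mid = low + (high - low) / 2;

                if (collection[mid] < number)
                {
                    low = mid + 1;  // the value can only be in the upper half
                }
                else if (collection[mid] > number)
                {
                    high = mid - 1; // the value can only be in the lower half
                }
                else
                {
                    return mid;     // found: report the position, not the value
                }
            }

            return -1;
        }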

5 Steps for Coding for the Next Developer

Most of us probably don’t think about the developer who will maintain our code. Until recently, I did not consider them either. I never intentionally wrote obtuse code, but I also never left any breadcrumbs.

Kent Beck on good programmers:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

Douglas Crockford on good computer programs:

It all comes down to communication and the structures that you use in order to facilitate that communication. Human language and computer languages work very differently in many ways, but ultimately I judge a good computer program by its ability to communicate with a human who reads that program. So at that level, they’re not that different.

Discovering purpose and intent is difficult even in the most well written code. Any breadcrumbs left by the author (comments, verbose naming and consistency) are immensely helpful to the next developer.

I start by looking for patterns. Patterns can be found in many places including variable names, class layout and project structure. Once identified, patterns are insights into the previous developer’s intent and help in comprehending the code.

What is a pattern? A pattern is a repeatable solution to a recurring problem. Consider a door. When a space must allow people to enter and to leave and yet maintain isolation, the door pattern is implemented. Now this seems obvious, but at one point it wasn’t. Someone created the door pattern, which included the door handle, the hinges and the placement of these components. Walk into any home and you can identify any door and its components. The styles and colors might be different, but the components are the same. Software is the same.

There are known software patterns for common software problems. In 1995, Design Patterns: Elements of Reusable Object-Oriented Software was published describing common software patterns. This book describes common problems encountered in most software applications and offers an elegant way to solve them. Developers also create their own patterns while solving problems they routinely encounter. While they don’t publish a book, if you look closely enough you can identify them.

Sometimes it’s difficult to identify the patterns. This makes grokking the code difficult. When you find yourself in this situation, inspect the code and see how it is used. Start a re-write. Ask yourself how you would accomplish the same outcome. Often as you travel the thought process of an algorithm, you gain insight into the other developer’s implementation. Many of us have the inclination to re-write what we don’t understand. Resist this urge! The existing implementation is battle-tested and yours is not.

Some code is just vexing. Reach out to a peer; a second set of eyes always helps. Walk the code together. You’ll be surprised what the two of you will find.

Here are 5 tips for leaving breadcrumbs for the next developer:

1. Patterns
Use known patterns, create your own patterns. Stick with a consistent paradigm throughout the code. For example, don’t have 3 approaches to data access.

2. Consistency
This is by far the most important aspect of coding. Nothing is more frustrating than finding inconsistent code. Consistency allows for assumptions. Each time a specific software pattern is encountered, it should be safe to assume it behaves like the other instances of the pattern.

Inconsistent code is a nightmare. Imagine reading a book with every word meaning something different, including the same word in different places. You’d have to look up each word and expend large amounts of mental energy discovering the intent. It’s frustrating, tedious and painful. You’ll go crazy! Don’t do this to the next developer.

3. Verbose Naming
This is your language. These are the words to your story. Weave them well.

This includes class names, method names, variable names, project names and property names.

Don’t:

if (monkey.HoursSinceLastMeal > 3)
{
    FeedMonkey();
}

Do:

int feedInterval = 3;
if (monkey.HoursSinceLastMeal > feedInterval)
{
    FeedMonkey();
}

The first example has 3 hard-coded in the if statement. This code is syntactically correct, but the number 3 tells you nothing about its intent. Looking at the property it’s evaluated against, you can surmise that it’s really 3 hours. In reality we don’t know. We are making an assumption.

In the second example, we set 3 to a variable called ‘feedInterval’. The intent is clearly stated in the variable name. If it’s been 3 hours since the last meal, it’s time to feed the monkey. A side effect of setting the variable is we can now change the feed interval without changing the logic.

This is a contrived example, but in a large piece of software this type of code is self-documenting and will help the next developer understand the code.

4. Comments
Comments are a double-edged sword. Too much commenting increases maintenance costs; too little leaves developers unsure of how the code works. A general rule of thumb is to comment when the average developer will not understand the code. This happens when the assumptions are not obvious or the code is out of the ordinary.
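As a small illustration of a comment that earns its keep, consider the sketch below. The class and method names are hypothetical. The code looks odd at first glance, so the comment records the assumption behind it rather than restating what the code does.

```csharp
public static class Midpoint
{
    // Returns the index halfway between low and high (low <= high).
    public static int Of(int low, int high)
    {
        // Deliberately not (low + high) / 2: that sum can overflow int
        // when both indexes are large. Subtracting first keeps every
        // intermediate value in range.
        return low + (high - low) / 2;
    }
}
```

Without the comment, the next developer might "simplify" the expression and reintroduce the overflow.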

5. Code Simple
In my professional opinion writing complex code is the biggest folly among developers.

Steve Jobs on simplicity:

Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it’s worth it in the end because once you get there, you can move mountains.

Complexity comes in many forms, some of which include: future proofing, overly complex implementations, too much abstraction, large classes and large methods.

For more on writing clean, simple code, see Uncle Bob’s book Clean Code and Max Kanat-Alexander’s Code Simplicity.

Closing

Reading code is hard. With a few simple steps you can ensure the next developer will grok your code.

HTML Extended

There is an emerging trend to extend HTML. To the untrained eye the changes are not obvious.

For example, AngularJS extends the HTML with directives:

<my-directive my-data="user"></my-directive>

To the browser this snippet of HTML is meaningless. Fortunately AngularJS has an internal compiler that converts it to meaningful HTML.

EmberJs is following suit with their upcoming 1.11 release. Web components will now appear as inline HTML. Like AngularJS, EmberJs will convert it to meaningful HTML.

After EmberJs release 1.11 markup:

<my-video src={{movie.url}}></my-video>

Starting with ASP.NET 5, ASP.NET will have “TagHelpers.” In the past to render HTML with Razor you’d do something like this:


<ul>
<li>@Html.ActionLink("Default", "Index", "Home")</li>
</ul>

Now with TagHelpers:


<ul>
<li><a controller="Home" action="Index">Default</a></li>
</ul>

This is an interesting trend. These technologies are extending HTML. At the end of the day it all must be rendered into meaningful HTML for the browser. I wonder if this is the end of custom view engines, such as Haml, Razor, Jade and EJS. It’ll be interesting to see how this plays out.

Iterate Small

After World War 2 the United States was the undisputed leader in manufacturing. During the 1960s this changed, and by the 1980s manufacturing in the United States was in decline. Japanese companies made the same products as the United States, but their products were of a higher quality and cheaper. How did they do it? Many factors played a role, but one factor was batch size. By lowering batch sizes Japanese companies increased quality and were able to deliver more product.

Lowering batch sizes allowed for a more nimble process. Instead of having 20 tons of raw materials in process, they only needed 2 tons. When a batch was defective (it had bugs) only 2 tons of raw material were lost instead of 20 tons. The entire process was more efficient.

Applying this to software engineering, we want to develop small and deploy small.

Develop Small
Manufacturing is not an exact parallel to software development, but many of the principles are applicable. For instance, the more code changed, the more opportunity for bugs to manifest. Minimizing the number of changes lessens the likelihood of bugs.

Break tasks into small chunks. Even a large feature can be split into small tasks. It’s fine to ship benign code.

Deploy Small
Deploying small requires a build and deployment process that can be confidently run multiple times a day.

Continual deployments allow for an evolving product. It is nothing like upgrading from Windows 7 to Windows 8 (for those who did not experience this, it was a drastic change). Small changes are deployed in small increments, with less impact on the user and less opportunity for something to go wrong.

Implementing Transparent Encryption with NHibernate Listeners (Interceptors)

Have you ever had to encrypt data in the database? In this post, I’ll explore how to use nHibernate Listeners to encrypt and decrypt data going into and coming from your database. The cryptography will be transparent to your application.

Why would you want to do this? SQL Server has encryption baked into the product. That is true, but if you are moving to the cloud and want to use SQL Azure you’ll need some sort of cryptography strategy. SQL Azure does not support database encryption.

What is an nHibernate Listener? I think of a Listener as a piece of code that I can inject into specific extensibility points in the nHibernate persistence and data hydration lifecycle.

As of this writing the following extensibility points are available in nHibernate.

  • IAutoFlushEventListener
  • IDeleteEventListener
  • IDirtyCheckEventListener
  • IEvictEventListener
  • IFlushEntityEventListener
  • IFlushEventListener
  • IInitializeCollectionEventListener
  • ILoadEventListener
  • ILockEventListener
  • IMergeEventListener
  • IPersistEventListener
  • IPostCollectionRecreateEventListener
  • IPostCollectionRemoveEventListener
  • IPostCollectionUpdateEventListener
  • IPostDeleteEventListener
  • IPostInsertEventListener
  • IPostLoadEventListener
  • IPostUpdateEventListener
  • IPreCollectionRecreateEventListener
  • IPreCollectionRemoveEventListener
  • IPreCollectionUpdateEventListener
  • IPreDeleteEventListener
  • IPreInsertEventListener
  • IPreLoadEventListener
  • IPreUpdateEventListener
  • IRefreshEventListener
  • IReplicateEventListener
  • ISaveOrUpdateEventListener

The list is extensive.

To implement transparent cryptography, we need to find the right place to encrypt and decrypt the data. For encrypting the data we’ll use IPreInsertEventListener and IPreUpdateEventListener. With these events we’ll catch the new data and the updated data before they go into the database. For decrypting, we’ll use the IPreLoadEventListener.

For this demonstration we’ll be using DatabaseCryptography class for encrypting and decrypting. The cryptography implementation is not important for this article.

IPreLoadEventListener

public class PreLoadEventListener : IPreLoadEventListener
{
    readonly DatabaseCryptography _crypto = new DatabaseCryptography();

    /// <summary>
    /// Called when [pre load].
    /// </summary>
    /// <param name="event">The event.</param>
    public void OnPreLoad(PreLoadEvent @event)
    {
        _crypto.DecryptProperties(@event.Entity, @event.Persister.PropertyNames, @event.State);
    }
}

IPreInsertEventListener

public class PreInsertEventListener : IPreInsertEventListener
{
    readonly DatabaseCryptography _crypto = new DatabaseCryptography();

    /// <summary>
    /// Return true if the operation should be vetoed
    /// </summary>
    /// <param name="event">The event.</param>
    /// <returns>true if the insert should be vetoed, false otherwise.</returns>
    public bool OnPreInsert(PreInsertEvent @event)
    {
        _crypto.EncryptProperties(@event.Entity, @event.State, @event.Persister.PropertyNames);

        return false;
    }
}

IPreUpdateEventListener

public class PreUpdateEventListener : IPreUpdateEventListener
{
    readonly DatabaseCryptography _crypto = new DatabaseCryptography();

    /// <summary>
    /// Return true if the operation should be vetoed
    /// </summary>
    /// <param name="event">The event.</param>
    /// <returns>true if the update should be vetoed, false otherwise.</returns>
    public bool OnPreUpdate(PreUpdateEvent @event)
    {
        _crypto.EncryptProperties(@event.Entity, @event.State, @event.Persister.PropertyNames);

        return false;
    }
}

It’s important to note that both OnPreUpdate and OnPreInsert must return false. Returning true vetoes the operation, and the insert or update is aborted.

Now that we have the Listeners implemented we need to register them with nHibernate. I am using FluentNHibernate so this will be different if you are using raw nHibernate.

SessionFactory

public class SessionFactory
{
    /// <summary>
    /// Creates the session factory.
    /// </summary>
    /// <returns>ISessionFactory.</returns>
    public static ISessionFactory CreateSessionFactory()
    {
        return Fluently.Configure()
            .Database(MsSqlConfiguration.MsSql2012
                .ConnectionString(c => c
                    .FromConnectionStringWithKey("DefaultConnection")))
            // Assumption: SessionFactory stands in for any type that lives in
            // the assembly holding your mappings.
            .Mappings(m => m.FluentMappings.AddFromAssemblyOf<SessionFactory>())
            .ExposeConfiguration(s =>
            {
                // Encrypt on the way in, decrypt on the way out.
                s.SetListener(ListenerType.PreUpdate, new PreUpdateEventListener());
                s.SetListener(ListenerType.PreInsert, new PreInsertEventListener());
                s.SetListener(ListenerType.PreLoad, new PreLoadEventListener());
            })
            .BuildConfiguration()
            .BuildSessionFactory();
    }
}

Encrypting and decrypting data at the application level makes the data unreadable in the database. You’ll need to bring the data back into the application to read the values of the encrypted fields. We want to limit the fields that are encrypted and we only want to encrypt string values. Encrypting anything other than string values complicates things. There is nothing saying we can’t encrypt dates, but doing so requires the date field in the database to become a string (nvarchar or varchar) field to hold the encrypted data. Once we do this we lose the ability to operate on the date field from the database.

To identify which fields we want encrypted and decrypted I’ll use marker attributes.

Encrypt Attribute

public class EncryptAttribute : Attribute
{
}

Decrypt Attribute

public class DecryptAttribute : Attribute
{
}

To see the EncryptAttribute and the DecryptAttribute in action we’ll take a peek into the DatabaseCryptography class.

DatabaseCryptography

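public class DatabaseCryptography
{
    readonly Crypto _crypto = ObjectFactory.GetInstance<Crypto>();

    /// <summary>
    /// Encrypts the properties.
    /// </summary>
    /// <param name="entity">The entity.</param>
    /// <param name="state">The state.</param>
    /// <param name="propertyNames">The property names.</param>
    public void EncryptProperties(object entity, object[] state, string[] propertyNames)
    {
        Crypt<EncryptAttribute>(entity, propertyNames, s => _crypto.Encrypt(s), state);
    }

    /// <summary>
    /// Crypts the specified entity.
    /// </summary>
    /// <typeparam name="T">The marker attribute to look for.</typeparam>
    /// <param name="entity">The entity.</param>
    /// <param name="propertyNames">The property names.</param>
    /// <param name="crypt">The crypt.</param>
    /// <param name="state">The state.</param>
    private void Crypt<T>(object entity, string[] propertyNames, Func<string, string> crypt, object[] state) where T : Attribute
    {
        if (entity != null)
        {
            var properties = entity.GetType().GetProperties();

            foreach (var info in properties)
            {
                // Only touch properties carrying the marker attribute.
                var attributes = info.GetCustomAttributes(typeof (T), true);

                if (attributes.Any())
                {
                    var name = info.Name;
                    var count = 0;

                    // state[] is ordered like propertyNames[], so the matching
                    // name gives the index of the value to transform.
                    foreach (var s in propertyNames)
                    {
                        if (string.Equals(s, name, StringComparison.InvariantCultureIgnoreCase))
                        {
                            var val = Convert.ToString(state[count]);
                            if (!string.IsNullOrEmpty(val))
                            {
                                val = crypt(val);
                                state[count] = val;
                            }

                            break;
                        }

                        count++;
                    }
                }
            }
        }
    }

    /// <summary>
    /// Decrypts the properties.
    /// </summary>
    /// <param name="entity">The entity.</param>
    /// <param name="propertyNames">The property names.</param>
    /// <param name="state">The state.</param>
    public void DecryptProperties(object entity, string[] propertyNames, object[] state)
    {
        Crypt<DecryptAttribute>(entity, propertyNames, s => _crypto.Decrypt(s), state);
    }
}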
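To make the marker attributes concrete, here is a minimal sketch of how an entity might be decorated. The Customer entity and the MarkerDemo helper are hypothetical and not part of the original project. The two attribute classes are repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Linq;

public class EncryptAttribute : Attribute { }
public class DecryptAttribute : Attribute { }

// Hypothetical entity: only TaxNumber is stored encrypted.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }

    // Both markers are needed: one is checked on the way into the
    // database, the other on the way out.
    [Encrypt, Decrypt]
    public virtual string TaxNumber { get; set; }
}

public static class MarkerDemo
{
    // Lists the properties of entityType that carry the marker attribute T,
    // using the same reflection check as the Crypt method.
    public static string[] MarkedProperties<T>(Type entityType) where T : Attribute
    {
        return entityType.GetProperties()
            .Where(p => p.GetCustomAttributes(typeof(T), true).Any())
            .Select(p => p.Name)
            .ToArray();
    }
}
```

Calling MarkerDemo.MarkedProperties<EncryptAttribute>(typeof(Customer)) returns only TaxNumber, so Id and Name would pass through the listeners untouched.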

That’s it. Now the encryption and decryption of data will be transparent to the application and you can go on your merry way building the next Facebook.

The Allure of Rewriting an Application

Most software engineers have at some time in their careers advocated for a rewrite. This is a software engineer’s utopia. I admit, there was a time I was that software engineer. Thankfully those days are behind me.

Joel Spolsky explains the merits of rewriting software:

…the single worst strategic mistake that any software company can make.

That seems pretty harsh. Is it really that bad to toss the old and write anew?

Joel Spolsky has a thought on why software engineers want to rewrite code:

It’s harder to read code than to write it.

How many times have you read code and thought “What the hell were they thinking?” Worse yet, you’ve said it aloud. To understand code, you must mentally compile it. This is really hard. The author might be a novice, speak a different language or be an experienced coder. Heck the author could be you!

Have you ever read code and wondered why something was written a particular way? You rewrite it, only to discover why it was written in that particular way. Each line of code holds a bit of knowledge. Sometimes this knowledge is hard fought. Have you ever chased a bug for weeks?

To be fair, a rewrite may be the way to go, but most of the time it’s not.

Calling Stored Procedures with Code First

One of the weaknesses of Entity Framework 6 Code First is the lack of support for natively calling database constructs (views, stored procedures… etc). For those who have not heard of or used Code-First in Entity Framework (EF), Code-First is simply a Fluent mapping API. The idea is to create all your database mappings in code (i.e. C#) and the framework then creates and tracks the changes in the database schema.

In traditional Entity Framework to call a stored procedure you’d map it in your EDMX file. This is a multi-step process. Once the process is completed a method is created, which hangs off the DataContext.

I sought to make calling a stored procedure easier. At the heart of a stored procedure you have a procedure name, N number of parameters and a result set. I’ve written a small extension method that takes a procedure name, parameters and a return type. It just works. No mapping the procedure and its parameters.


using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

// Must live in a static class, as with any extension method.
public static List<TReturn> CallStoredProcedure<TParameters, TReturn>(this DataContext context, string storedProcedure, TParameters parameters)
    where TParameters : class
    where TReturn : class, new()
{
    var names = new List<string>();
    var sqlParameters = new List<object>();

    // Every public property on the parameters object becomes one SQL parameter.
    foreach (var property in parameters.GetType().GetProperties())
    {
        // ADO.NET expects DBNull.Value, not null, for a SQL NULL.
        object value = property.GetValue(parameters) ?? DBNull.Value;

        names.Add(string.Format("@{0}", property.Name));
        sqlParameters.Add(new SqlParameter(property.Name, value));
    }

    // Builds the parameter list, e.g. "@userId"
    var parms = string.Join(", ", names);

    return context.Database.SqlQuery<TReturn>(storedProcedure + " " + parms, sqlParameters.ToArray()).ToList();
}

 

Usage:


var context = new DataContext();

List<User> users = context.CallStoredProcedure<object,User>("User_GetUserById", new{userId = 3});
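To see what actually gets sent to the database, the command text can be built in isolation. This is a sketch with a hypothetical helper name (ProcedureText.Build) that mirrors the string handling in the extension method without touching a database.

```csharp
using System.Linq;

public static class ProcedureText
{
    // Turns a procedure name plus an object's public properties into
    // the command text, one "@name" placeholder per property.
    public static string Build(string storedProcedure, object parameters)
    {
        var names = parameters.GetType()
            .GetProperties()
            .Select(p => "@" + p.Name);

        return storedProcedure + " " + string.Join(", ", names);
    }
}
```

For the call above, ProcedureText.Build("User_GetUserById", new { userId = 3 }) yields "User_GetUserById @userId". Because the placeholders are passed positionally, the property order on the parameters object should match the parameter order of the stored procedure.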

Id’s, The Great Debate

Using ‘Id’ or NameId.

I had a conversation with Rob Toyias on Id’s. The impetus was an existing product we were working on that did not identify primary and foreign keys consistently. We spent a lot of time chasing down primary and foreign keys.

I was of the mindset that a table should have a primary key called “Id” and all foreign keys should be named after the table they reference. For example, a foreign key to the User table should be called UserId.

Rob disagreed. He thought that each id should be named consistently, even in the table where it originated. For example, in the user table, the primary key would be UserId.

My argument was if you have a table with 3 Id’s how do you know which one is the primary? He countered saying that the Id’s would be consistent across the database.

The conversation ended in a stalemate.

Fast-forward 6 months. I am on a different project. I’m working with Id’s in JavaScript. I applied the same Id naming scheme I use in the database to my JavaScript. On the employee page the variable id refers to the employee id. All other id’s are named after their respective pages (i.e. userId). This worked great. Until today. I won’t go too deep into the details, but what it came down to was sharing code between pages. The shared code assumes id is the id of the page, which it is, but Employee Id and User Id are not interchangeable.

After spending a number of hours fixing the Id debacle, I conceded to Rob’s point.

 
