Twitter: @citizenmatt
GitHub: https://github.com/citizenmatt/RefSemantics
Always be measuring – don’t do it because you think it’s the right thing to do. It might not be.
"Managed memory is cheap" is not quite true: allocation is cheap, but garbage collection is expensive. Garbage collection stops the world while things are cleaned up and optimized, and that pause can impact the performance of the application.
Performance Low Hanging Fruit
Object reuse
Object pooling. Pass in existing array rather than allocating a new one. Preallocate array length.
String concatenation.
Use StringBuilder. Preallocate its capacity if possible.
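A minimal sketch of the StringBuilder suggestion, assuming the final length can be estimated up front. JoinWords is a hypothetical helper, not something from the talk.

```csharp
using System;
using System.Text;

public static class StringBuilderSketch
{
    // Hypothetical helper: joins words with single spaces.
    public static string JoinWords(string[] words)
    {
        // Estimate the final length so the builder's internal buffer
        // does not have to grow while appending.
        int capacity = 0;
        foreach (string word in words)
        {
            capacity += word.Length + 1;
        }

        var builder = new StringBuilder(capacity);
        for (int i = 0; i < words.Length; i++)
        {
            if (i > 0) builder.Append(' ');
            builder.Append(words[i]);
        }
        return builder.ToString();
    }
}
```

Repeated string concatenation in a loop would instead allocate a new string on every iteration.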
params arguments
void MyParamsMethod(params string[] args) { /* ... */ }
In code:
MyParamsMethod("Hello", "World");
MyParamsMethod();
Compiler:
MyParamsMethod(new[] {"Hello", "World"});
MyParamsMethod(new string[0]); // this creates a new array object in memory
Instead, do this in code:
MyParamsMethod(new[] {"Hello", "World"});
MyParamsMethod(Array.Empty<string>()); // reuses an existing empty string array in memory. It's the same idea behind string.Empty
Suggestion: Introduce overloads with common number of arguments.
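The overload suggestion could look roughly like this. TotalLength is a made-up method for illustration; the idea is that calls with zero, one or two arguments bind to the fixed overloads and never allocate an array.

```csharp
using System;

public static class ParamsSketch
{
    // Overloads for the most common argument counts: no array is allocated.
    public static int TotalLength() => 0;
    public static int TotalLength(string a) => a.Length;
    public static int TotalLength(string a, string b) => a.Length + b.Length;

    // Fallback for any other number of arguments: the compiler allocates
    // a string[] at each call site that reaches this overload.
    public static int TotalLength(params string[] args)
    {
        int total = 0;
        foreach (string arg in args) total += arg.Length;
        return total;
    }
}
```

Overload resolution prefers the non-params overloads when the argument count matches, so existing call sites pick them up without changes.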
Boxing
Boxing creates a new object on the heap. Now you have two memory locations with the same value. Changing either value does not impact the other value.
Suggestion: Introduce generic overloads
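A small sketch of both points, with hypothetical method names: DemonstrateCopy shows that the boxed value is an independent copy, and AreEqual shows a generic overload that avoids the box.

```csharp
using System;
using System.Collections.Generic;

public static class BoxingSketch
{
    // Takes object: passing an int here boxes it into a new heap object.
    public static bool AreEqualBoxed(object left, object right)
        => object.Equals(left, right);

    // Generic overload: T is int at the call site, so no box is needed.
    public static bool AreEqual<T>(T left, T right)
        => EqualityComparer<T>.Default.Equals(left, right);

    public static int DemonstrateCopy()
    {
        int original = 1;
        object boxed = original; // boxing copies the value to the heap
        original = 2;            // does not affect the boxed copy
        return (int)boxed;       // unboxing yields the old value
    }
}
```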
Closures
The compiler converts closures to classes. Captured values are stored as fields of that class, and an instance is then allocated on the heap.
Suggestion: Avoid closures in critical paths. Pass state as an argument to the lambda. Investigate local functions. Since a local function's lifetime is known, its captured state can be allocated on the stack, where allocation and cleanup are cheap.
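The three options side by side, as a rough sketch. All names here are invented for illustration. The local function version relies on the function only being called directly, never converted to a delegate.

```csharp
using System;

public static class ClosureSketch
{
    // Captures 'threshold': the compiler generates a class to hold it and
    // allocates an instance (plus a delegate) on each call.
    public static int CountAboveWithClosure(int[] values, int threshold)
    {
        Func<int, bool> isAbove = value => value > threshold;
        int count = 0;
        foreach (int value in values)
            if (isAbove(value)) count++;
        return count;
    }

    // State is passed as an argument, so the lambda captures nothing
    // and the compiler can cache a single delegate instance.
    public static int CountAboveWithState(int[] values, int threshold)
        => CountWhere(values, threshold, (value, limit) => value > limit);

    private static int CountWhere(int[] values, int state, Func<int, int, bool> predicate)
    {
        int count = 0;
        foreach (int value in values)
            if (predicate(value, state)) count++;
        return count;
    }

    // Local function called directly: captured variables can live in a
    // struct on the stack instead of a heap-allocated class.
    public static int CountAboveWithLocalFunction(int[] values, int threshold)
    {
        int count = 0;
        foreach (int value in values)
            if (IsAbove(value)) count++;
        return count;

        bool IsAbove(int value) => value > threshold;
    }
}
```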
LINQ
Lambda expressions are treated the same as closures: they are compiled to a class that is allocated on the heap. Because much of LINQ is based on static extension methods over IEnumerable, each operator also allocates iterator and enumerator objects.
Suggestion: Avoid LINQ in critical paths. Use good old foreach and if statements.
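For comparison, the same calculation written both ways. This is only a sketch with made-up method names; measure before swapping one for the other.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqSketch
{
    // Where allocates an iterator object, and Sum enumerates it
    // through the IEnumerable<int> interface.
    public static int SumOfEvensWithLinq(List<int> values)
        => values.Where(v => v % 2 == 0).Sum();

    // Plain foreach and if: List<T>'s enumerator is a struct,
    // so this loop does not allocate.
    public static int SumOfEvensWithLoop(List<int> values)
    {
        int sum = 0;
        foreach (int v in values)
        {
            if (v % 2 == 0) sum += v;
        }
        return sum;
    }
}
```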
Iterators
Iterators are rewritten as a state machine, which means more allocations on the heap.
Suggestion: Return a collection. Be aware of the cost.
async/await
Async/await also generates a state machine. Task and Task<T> also trigger more allocations, which then can't be reused.
Look into ValueTask<T> for common use cases.
Suggestion: Investigate ValueTask<T>.
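One common use case for ValueTask<T> is a method that usually completes synchronously, for example from a cache. The sketch below assumes a runtime where ValueTask<T> is available; CachedLengthReader and its "load" step are hypothetical stand-ins for real I/O.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class CachedLengthReader
{
    private readonly Dictionary<string, int> _cache = new Dictionary<string, int>();

    public ValueTask<int> GetLengthAsync(string key)
    {
        if (_cache.TryGetValue(key, out int cached))
        {
            // Synchronous completion: the result is wrapped in a struct,
            // so no Task<int> is allocated on this path.
            return new ValueTask<int>(cached);
        }

        // Slow path: falls back to a regular Task<int>.
        return new ValueTask<int>(LoadAsync(key));
    }

    private async Task<int> LoadAsync(string key)
    {
        await Task.Yield(); // stands in for real asynchronous work
        int length = key.Length;
        _cache[key] = length;
        return length;
    }
}
```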
Heap Vs Stack
Each method pushes space onto the stack for local variables. When the method is exited the memory is popped from the stack.
Stack allocation and cleanup are cheap because the memory has a known lifetime. However, stack space is limited.
*Passing a value type copies the data, whereas passing a reference does not. It's important to note that copying data isn't that expensive.
Reference Semantics with Value Types
Allows value types to be used like reference types
– pass by reference everywhere
Use value types to reduce allocations, reduce memory traffic, etc.
– Throughput!
Pass by reference to avoid copies, enables modifying, etc.
Very low-level micro-optimizations…
– But they’ll be used in the platform…
– (And games, and parsing, and serialization, and …)
*These are low-level optimizations that should only be used in paths where performance matters.
C# 7.2 Reference Semantics with Value Types
Allocating a reference type has a cost, but passing it around is cheap.
Allocating a value type is cheap, but passing it around has a cost.
Why can’t it be cheap to allocate AND cheap to pass around?
in parameters
Pass value type by reference. Called methods cannot modify it.
Method argument modifier
Complements out and ref
Passed by reference
Method cannot modify original value
Compiler enforces safety with defensive copy when calling members
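A minimal sketch of an in parameter. Point3D and Dot are hypothetical names. Reading fields through an in parameter does not need a defensive copy; calling methods or properties on a non-readonly struct can.

```csharp
using System;

public struct Point3D
{
    public double X, Y, Z;

    public Point3D(double x, double y, double z)
    {
        X = x; Y = y; Z = z;
    }
}

public static class InParameterSketch
{
    // Both arguments are passed by reference, so the three doubles
    // are not copied for the call.
    public static double Dot(in Point3D a, in Point3D b)
    {
        // a.X = 0; // would not compile: 'a' is a read-only reference
        return a.X * b.X + a.Y * b.Y + a.Z * b.Z;
    }
}
```

At the call site the in modifier is optional: both Dot(in p, in q) and Dot(p, q) compile.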
ref locals and ref returns (C# 7.0)
ref returns
Returns a reference to the value, not a copy of the value.
– return type of method becomes e.g. integer reference int& in IL
Lifetime of returned value must exceed the lifetime of the called method
- e.g. a reference to a field or method argument. NOT a variable in the called method.
- Not allowed on async methods.
Modifying this reference is the same as modifying the original value
- e.g. return reference to array element, and update it in place.
Add ref modifier to method declaration return type, and to return statement
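A sketch of a ref return that hands back a reference to an array element so the caller can update it in place. Find is a hypothetical method.

```csharp
using System;

public static class RefReturnSketch
{
    // 'ref' appears on the return type and on the return statement.
    // The array outlives the call, so returning a reference into it is safe.
    public static ref int Find(int[] values, int target)
    {
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] == target)
            {
                return ref values[i];
            }
        }
        throw new InvalidOperationException("Value not found.");
    }
}
```

A caller can assign straight through the returned reference, e.g. RefReturnSketch.Find(numbers, 2) = 20; replaces the matching element inside numbers.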
ref locals
Assigning a ref return to a new variable will create a copy
- The variable is a value type, not a reference. (Cannot assign int& to int)
A ref local is a variable that is a reference to a value type
- Accessing the variable accesses the original value
Use a ref local to store the ref return result
Type inference with var will get the value type, not the ref modifier
- Requires ref var to work as expected.
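The difference between a copy and a ref local, sketched with a hypothetical First method. Only the ref local and ref var assignments reach the array.

```csharp
using System;

public static class RefLocalSketch
{
    private static ref int First(int[] values) => ref values[0];

    public static int[] Demonstrate()
    {
        int[] values = { 1, 2, 3 };

        int copy = First(values);          // plain local: copies the value
        copy = 100;                        // values[0] is unchanged

        var alsoCopy = First(values);      // var infers int, still a copy
        alsoCopy = 200;                    // values[0] is unchanged

        ref int alias = ref First(values); // ref local: refers to values[0]
        alias = 10;                        // writes through to the array

        ref var second = ref values[1];    // ref var keeps the reference
        second = 20;                       // writes through to the array

        return values;
    }
}
```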
ref readonly returns
Returns a read only value type by reference
readonly struct
Immutable value types
in parameters and ref readonly can create defensive copies
- The compiler doesn’t know if the struct’s methods will modify state
readonly struct – compiler enforces all fields and properties are readonly
Immutable
More efficient – no copies made when calling members
- Improves performance (micro-optimization)
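A sketch of a readonly struct used through in parameters. Money is a made-up type. Because the struct is declared readonly, the compiler knows Add cannot modify state and does not need a defensive copy before calling it.

```csharp
using System;

public readonly struct Money
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    // Returns a new value instead of mutating this one.
    public Money Add(in Money other)
        => new Money(Amount + other.Amount, Currency);
}

public static class ReadOnlyStructSketch
{
    // 'a' is a read-only reference; calling a.Add is safe without a copy
    // because Money is a readonly struct.
    public static decimal Total(in Money a, in Money b)
        => a.Add(b).Amount;
}
```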
ref struct
Stack only value types
Declare a value type that can only be stack allocated
- I.e can never be part of a reference type
This constrains lifetime to calling method
- Also, cannot be boxed, cannot use inside a non-ref struct
- Cannot use with async methods or iterators
- Cannot be used as a generic type argument
Limited use cases
- Working with stackalloc memory
- Primarily for Span<T>
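A sketch of the stackalloc use case, assuming C# 7.2 or later with Span<T> available. SumOfSquares and its size limit are hypothetical; the limit is there because stack space is limited.

```csharp
using System;

public static class StackallocSketch
{
    public static int SumOfSquares(int count)
    {
        // Keep the buffer small: it lives on the stack.
        if (count < 0 || count > 256)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        // Span<T> is a ref struct, so it can safely wrap stack memory
        // without unsafe code; the buffer is gone when the method returns.
        Span<int> buffer = stackalloc int[count];
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = (i + 1) * (i + 1);
        }

        int sum = 0;
        foreach (int value in buffer)
        {
            sum += value;
        }
        return sum;
    }
}
```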
What does Span<T> have to do with ref struct?
For thread safety, need to update all fields of Span<T> atomically (tearing)
- Whole point is performance – cannot use synchronization
Internal pointers require special GC tracking
- Too many in flight at once is expensive
How can Span<T> represent stackalloc memory if Span<T> was on the heap?
Solution: Span<T> is a ref struct – can only be created on the stack
- Constrained lifetime, single thread access
Span<T>
New type to unify working with any kind of contiguous memory
- Arrays, array segments, strings and substrings, native memory, stackalloc, etc.
Provides array-like API – indexer
- ReadOnlySpan<T> provides getter indexer only
Type safe – each element is of type T
Array-like performance
- Not quite, but newer runtimes have special support
Slicing
- Create a new Span with a sub-section of existing – without allocations!
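A sketch of slicing, assuming the AsSpan extension methods from System.Memory or a newer runtime. DoubleRange and CountChar are hypothetical helpers. The slice is a view over the original memory, so writes go through to the array and no substring is allocated.

```csharp
using System;

public static class SpanSliceSketch
{
    // Doubles a sub-range of the array in place through a slice.
    public static void DoubleRange(int[] values, int start, int length)
    {
        Span<int> slice = values.AsSpan(start, length); // no copy
        for (int i = 0; i < slice.Length; i++)
        {
            slice[i] *= 2;
        }
    }

    // Counts a character in part of a string without calling Substring.
    public static int CountChar(string text, int start, int length, char target)
    {
        ReadOnlySpan<char> slice = text.AsSpan(start, length);
        int count = 0;
        for (int i = 0; i < slice.Length; i++)
        {
            if (slice[i] == target) count++;
        }
        return count;
    }
}
```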
Span<T> Implementation
Value Type – struct
System.Memory NuGet package
- .NET Standard 1.1 (.NET Framework 4.5)+
New APIs and overloads in the BCL
- E.g. String.AsSpan(), Stream.ReadAsync(), Utf8Parser.TryParse()
- Significant usage of ref semantics – allocation free!
Span<T>, ReadOnlySpan<T>, Memory<T>
Two versions – “portable” and “fast”
- fast requires runtime support
Span<T> Performance – Portable Implementation
Portable works on .NET Standard 1.1 and above
- .NET Framework 4.5+
Portable is not slow
- But not as fast as arrays
Three fields – object reference, internal offset and length
- Slightly larger than fast version, dereferencing is slightly more complex operation.
Span<T> Performance – Fast Implementation
Fast requires runtime support
- .NET Core 2.1
Only has two fields – “byref” internal pointer and length
- Slightly smaller struct and accessing an element is slightly simpler operation
Specific JIT optimizations
- e.g. eliding bounds checks in loops, like arrays
Very close to array performance