Tips for Efficient Coding in MATLAB

This tutorial assumes a general fluency and familiarity with MATLAB coding. If you have not had significant prio experience coding in MATLAB, refer to introductory tutorials (e.g. MATLAB Onramp) on the MATLAB and Simulink Training page

MATLAB offers some excellent tips on improving code preformance for scientific calculations. In particular, in most students' code, the most important concepts that should be understood to improve speed and clarity of code are:

  1. Vectorization
  2. Preallocating
  3. Conditional Indexing
1. Vectorization

MATLAB (short for MATrix LABoratory) is optimized from the ground up for matrix and vector calculations. As such, code will execute significantly faster if a problem can be constructed using matrix (or vector) computations rather than multiple nested loops. Combined with the following (conditional indexing), this also improves the clarity of code.

Consider, as a simplistic example, that we want to calculate the resonant frequency of any combination of inductance and capacitance in some range. Using nested for-loops, this would be implemented as

tic;					%start a timer
Ls = logspace(-9, -4, 1000);		%logarithmically spaced vector of 1,000 inductances from 1 nH to 0.1 mH
Cs = logspace(-12, -4, 10000);		%logarithmically spaced vector of 10,000 capacitances from 1 pF to 0.1 mF
for i = 1:length(Ls)
	for j = 1:length(Cs)
		fo(i,j) = 1/2/pi/sqrt(Ls(i)*Cs(j));
	end
end
toc;					%display elapsed time
				

After executing, fo is a 1,000 x 10,000 matrix of resonant frequencies. Due to the inclusion of tic and toc, when the code is executed the command line will display the computation time. On my computer,

Elapsed time is 2.874814 seconds.

for this script.

Using matrix computations, we instead formulate the problem such that matrix operations are used instead of nested for loops. For this application, the code becomes

tic;					%start a timer
Ls = logspace(-9, -4, 1000);		%logarithmically spaced vector of 1,000 inductances from 1 nH to 0.1 mH
Cs = logspace(-12, -4, 10000);		%logarithmically spaced vector of 10,000 capacitances from 1 pF to 0.1 mF
[C,L] = meshgrid(Cs,Ls);		%generate meshgrid of L and C values
fo = 1/2/pi./sqrt(L.*C);
toc;					%display elapsed time
				

This code produces the same matrix of results, fo. Again on my machine, the execution time of this code is

Elapsed time is 0.066403 seconds.

The code leverages a handy function, meshgrid(). In this implementation, [X,Y] = meshgrid(x,y) takes two vectors xand y and returns two 2-D matrices Xand Y of size length(y)-by-length(x). In X, every row is an identical copy of x and in In Y, every column is an identical copy of y. Thus, each of the $i^{th}$ row and $j^{th}$ column elements of Xand Y represent all unique combinations of the elements of each.

Once the meshgrid has been formed, fo is calculated using element-by-element matrix algebra. Note the use of .* when multiplying L and C. This operator is elementwise multiplication, rather than matrix multiplication; given the above discussion of meshgrid(), the computation L.*C results in a 1,000 x 10,000 matrix where the $i^{th}$ row and $j^{th}$ column value is the product of Ls(i) and Cs(j). The sqrt() function performs elementwise computations on matrices by default, and ./ is again used to perform elementwise divison.

2. Preallocation

If the nested for-loop implementation from the previous section is run a second time, without clearing the workspace, you may find that it runs much faster. On my machine, for a second run,

Elapsed time is 0.146933 seconds.

which is still slower than the vectorized implementation, but much faster than the original execution. The primary cause of this faster runtime on the second execution is that the variable fo has already been allocated space in memory.

On the original execution of the nested for-loop code, the size of matrix fo increases with successive iterations. For the first 10,000 iterations of the inner loop, fo increases from 1x1 to 1x10,000 each iteration. Then, for each iteration of the outer loop, fo increases from 1x10,000 to 1,000 x 10,000. Each time the size of fo increases, MATLAB needs to ensure that enough physical memory on the executing machine is dedicated to the storage of fo. This requires that fo be constantly moved around in memory with new size, slowing execution time considerably.

The meshgrid implementation does not have this issue. The call to meshgrid() creates matrices L and C which are 1,000 x 10,000 initially, and when fo is defined, it is already known that it will be the same size, so each variable is allocated memory only once.

Neglecting the benefits of vectorization from the previous section, the preallocation problem in the nested for-loop script can be solved by first preallocating the proper size of memory for fo. This can be done by adding the statement

fo = zeros(1000,10000);

anywhere prior to the for-loop.

3. Conditional Indexing

In matlab, matrices and vectors can be indexed in different manners. In the most straightfoward case, fo(i,j), where i and j are integers within the range of the size of the matrix, will return the element in the $i^{th}$ row and $j^{th}$ column of matrix fo(i,j). Additionally, fo(i) will return the $i^{th}$ element in the vector created when the matrix is traversed first by column, then by row (then by and additional dimensions). For our 1000 x 10,000 matrix, fo(12000) will return the same value as fo(1000,12) or fo(end,12).

Going further, using the latter method, we can index multiple elements to extract them as a vector. As an example, fo([1, 12000, 12005]) will return a vector containing [fo(1,1), fo(1000,12), fo(5,13)]

In a final method, which is the focus of this section, fo can be indexed by a binary matrix $A$ of the same dimension as fo, resulting in a vector of all values at indices where $A$ contains a "1". This approach gives rise to the possibility of conditional indexing.

As an example, assume we want to find the combinations of L and C from the previous calculation that give rise to a resonant frequency above 5 GHz. We could do this inefficiently through the nested for-loop technique which assesses each entry one-by-one and stores only those that meet the criterion. Alternately, we could note that

fo >5e9

returns a binary matrix, of the same size as fo, in which there is a "1" in any location where the corresponding frequency is greater than 5 GHz and a zero elsewhere. Then, we can use this matrix as an index into L and C to extract the corresponding values. Thus,

[L(fo>5e9), C(fo>5e9)]

returns the 9 (in this case) paris of inductance and capacitance within our range that result in a resonant frequency above 5 GHz. Their corresponding resonant frequencies are

fo(fo>5e9)

The approach can be used with additional relational or logical operators. Suppose we want to find all L-C pairs which result in a resonant frequency within 1MHz $\pm$ 1kHz and with a capacitance less than 260 pF. Then

loc = abs(fo-1e6)<1e3 & C<.26e-9;
[L(loc), C(loc)]

returns the six L-C pairs within our set that satisfy these criteria.

Concluding Remarks

The above examples are intentionally simplistic to show the mechanisms at play. Clearly, finding the resonant frequency of a pair of L-C components could be done in a much more efficient manner than iterating all possibilities numerically. Finding where vectorization and conditional indexing can be used to improve performance and clarity of your code requires practice and experience. However, correct use of these techniques can allow code execution time to reduce from tens of minutes to a few seconds in many complex system analysis scenarios.