## Mathematica Programming » Performance Tuning

## Packed Arrays

Mathematica has a wide variety of low-level optimizations that we never run into in our high-level usage. One of the most useful of these is the concept of the `PackedArray`

.

These are efficiently stored arrays of a single type of object that perform much faster on many simple numerical manipulations than their "unpacked" companions.

Most internal functions that return arrays will return them in packed form. We can check that an array is packed with `Developer`PackedArrayQ`

:

` ````
reals = RandomReal[{-1, 1}, {1000, 1000}];
Developer`PackedArrayQ@reals
```

` ``True`

The memory footprint of these is much lower than standard arrays:

` ````
ureals=Developer`FromPackedArray@reals;
Developer`PackedArrayQ@ureals
ByteCount@reals
ByteCount@ureals
```

` ``False`

` ``8000152`

` ``24208200`

And operations that can return a packed array are faster than their unpacked variants:

` ````
1+reals//RepeatedTiming//First
1+ureals//RepeatedTiming//First
```

` ``0.0024543400000000002`3.`

` ``0.0099898333333333315`2.`

### Unpacking

The biggest thing to worry about with packed arrays is the time/memory footprint that occurs when a packed array must be unpacked to be processed by a function that can't handle it in packed form.

For instance the function `Chop`

necessarily unpacks as it replaces the `Real`

number `0.`

with the `Integer`

number `0`

. This means `Chop`

will actually be *slower* on a packed array:

` ````
Chop@reals//RepeatedTiming//First
Chop@ureals//RepeatedTiming//First
```

` ``0.1129493750000000046`2.`

` ``0.052316666666666671`2.`

But if we use a function that can leverage the packing of the array it will be faster. For instance, let's look at the `Real`

variant of `Chop`

, a function called `Threshold`

:

` ````
Threshold@reals//RepeatedTiming//First
Threshold@ureals//RepeatedTiming//First
```

` ``0.0030930624999999999`2.`

` ``0.0107768695652173899`2.`

The packed version is much faster now.

## RawArray

The `RawArray`

is a relatively new addition to the language. It works rather like `PackedArray`

in its requirement ot use a single type of object in it, but is a fundamentally different object.

It's main benefit is how little memory it consumes:

` ``reals//ByteCount`

` ``8000152`

If we use a 64-bit real we get similar memory consumption to the packed array:

` ````
raw64=RawArray["Real64", reals];
raw64//ByteCount
```

` ``8000096`

But we can specify that it out to use half as many bytes, as the memory usage will drop correspondingly:

` ````
raw32=RawArray["Real32", reals];
raw32//ByteCount
```

` ``4000096`

These can also be used more effectively via Library Link but we won't get into that now. For many data-intensive custom `Import`

formats `RawArray`

can help cut down on the memory consumption, particularly where some large portions of the data may not be immediately useful.

## SparseArray

We won't discuss `SparseArray`

much, as the sparse array is a well-known concept especially in numerical linear algebra.

It is worth keeping in mind, though, for places where one has highly sparse data. One classic place this can show up is in quantum mechanical calculations in a direct-product basis constructed from orthonormal bases. Here often one will obtain expressions that looks like

` ``H[n, m, i, j] = H1*δ[i, j] + H2*δ[n, m]`

Assuming `N1`

elements in the first basis and `N2`

in the second, this can be more efficiently represented as:

` ``H = KroneckerProduct[H1, IdentityMatrix[N2]] + KroneckerProduct[IdentityMatrix[N1], H2]`

And now if we visualize this with `H1`

and `H2`

being odd-looking matrices which arise in this context:

` ````
H1=Table[(1/2)(-1)^(i-j)If[i==j, π^2/3, 2/(i-j)^2], {i, 50}, {j, 50}];
H2=Table[((-1)^(i-j)/(2*(.05)^2))If[i==j, π^2/3, 2/(i-j)^2], {i, 75}, {j, 75}];
H1//MatrixPlot
```

We can directly construct `H`

in a the standard fashion:

` ````
H =
KroneckerProduct[H1, IdentityMatrix[Length@H2]] +
KroneckerProduct[IdentityMatrix[Length@H1], H2];//AbsoluteTiming//First
```

` ``4.523075``

It takes a while and we see it has a large memory footprint:

` ``H // ByteCount`

` ``340260728`

But if we visualize it we see it's most zeros:

` ``H // MatrixPlot`

If we build it instead use `SparseArray`

nothing will be done with that vast field of zeros, so it'll be faster and less memory intensive:

` ````
HS =
KroneckerProduct[H1, IdentityMatrix[Length@H2, SparseArray]] +
KroneckerProduct[IdentityMatrix[Length@H1, SparseArray], H2];//AbsoluteTiming//First
HS//ByteCount
```

` ``0.118826``

` ``11191512`

And even better, when we ask for its smallest 5 `Eigenvalues`

these will be computed faster, and increasingly so with matrix size:

` ````
Eigenvalues[H, -5]//AbsoluteTiming
Eigenvalues[HS, -5]//AbsoluteTiming
```

` ``{4.943048`,{0.3917953580905429`,0.3745580141408224`,0.3611510903074643`,0.35157483336542145`,0.34582905618130677`}}`

` ``{3.733548`,{0.39179535809072363`,0.37455801414116613`,0.3611510903079675`,0.35157483336567635`,0.34582905618161475`}}`