Category "Optimization"


Atle Borsholm recently posted a clever solution for finding the n-th smallest element in an array on the IDL Data Point. He compares this to a naive solution which simply sorts all the elements and grabs the n-th element:

IDL> tic & x = ordinal_1(a, 123456) & toc
% Time elapsed: 3.0336552 seconds.

His solution performs much better:

IDL> tic & x = ordinal_2(a, 123456) & toc
% Time elapsed: 0.46286297 seconds.

I have a HISTOGRAM-based solution called MG_N_SMALLEST in mglib that can do even better:

IDL> tic & x = mg_n_smallest(a, 123456) & toc
% Time elapsed: 0.18394303 seconds.

Note: MG_N_SMALLEST does not return the n-th smallest element directly, but returns indices to the smallest n elements.

I have a more detailed description of what MG_N_SMALLEST is doing in an older article. I like this routine as a good example of using HISTOGRAM and its REVERSE_INDICES keyword. It also a nice example of when using a FOR loop in IDL isn’t so bad.

I include even more detail on this routine in the “Performance” chapter of my book.

Arrays of non-negative integers are encountered frequently in IDL, e.g., for index arrays and lists of ENVI file IDs to name a couple examples. Since valid data elements are non-negative integers, -1 is used to indicate an empty set. I have a few routines for efficiently computing set unions, intersections, differences, and complements.

For example, suppose I have found the valid data in an image with:

IDL> valid_ind = where(band gt 0.)

And then I find indices of the clouds in the image using my special routine FIND_CLOUD_INDICES:

IDL> noncloud_ind = find_cloud_indices(band)

Now, if I want to find the indices of the pixels of the image that are valid and not cloud pixels, I can use my MG_SETINTERSECTION routine:

IDL> valid_ind = mg_setintersection(valid_ind, noncloud_ind)

Download the routines or look at the documentation.

A pre-release of GPULib was made available today. GPULib is a library of common scientific computational routines that use modern graphics processing units (GPUs) to improve performance. I saw speedups of 5-50 times faster than normal IDL code for example code I was writing. Vectorized code that is still not fast enough can be improved with GPULib.

Bindings for Java, Python, MATLAB, and IDL are available. The IDL bindings are fairly simple to use and have a compatibility mode that allows running the code even without a GPU (performance was approximately equivalent to normal IDL code for the examples I wrote).

The documentation lists all the routines available (IDLdoc 3.0b3 spotted in the wild!).

Full disclosure: I work for Tech-X Corporation and worked on the IDL bindings and examples for GPULib.

FastDL 2.4 was released today. FastDL provides a way to do cluster computing from within IDL in two different flavors: mpiDL and TaskDL. For problems which require communication between nodes, mpiDL provides an interface to the MPI standard. If you are familiar with MPI, then mpiDL should be very easy to use. It can now be used with MPICH, MPICH2, or OpenMPI implementations of MPI. For problems where nodes do not need to communicate with each other, TaskDL is a task farming solution for IDL.

Full disclosure: I work for Tech-X Corporation and did some work on this project.

Recently I had to find the neighbors in a 3 by 3 by 3 grid around a given point in 3D space. J.D. Smith’s Dimensional Juggling Tutorial has a lot of similar use of the REFORM and REBIN routines, but mostly for the purpose of “inflating” a dimension of an array. This problem called for cyclically repeating the same values in different patterns.

Continue reading “Local grid points.”

For a recent project, I needed to compute a local mean and standard deviation for an array. Given a window size, the local mean for a given pixel location is the mean of all the elements around that pixel in the window. So the local mean is an array the same size as the original image. The local mean is simple to calculate:

kernel = fltarr(width, width) + 1.0
local_mean = convol(image, kernel) / width^2

for a window size of width. But I wasn’t sure if I could calculate the local standard deviation or other moments beyond the mean (without a loop) since they have a power of xj - meani term. It turns out that you can with some tedious algebra; it’s fairly straightforward to calculate.

The IDL routine MOMENT can also calculate mean absolute deviation. I haven’t been able to come up with a way to compute a local mean absolute deviation without a loop.

Here is mg_local_moment.pro (doc).

Find the n smallest elements of a data set Finding the n smallest elements of an array is a common problem with an informative solution. It is not necessary to sort the entire array to solve this problem. To solve it, I’m going to use HISTOGRAM in a slightly non-intuitive manner (or maybe it is intuitive if you’ve read J.D. Smith’s Histogram Tutorial).

Continue reading “Finding the n smallest elements in an array.”

The TEMPORARY routine is a simple way to use memory more efficiently. Statements of the form

a = a + 5

create a temporary variable to hold the intermediate result a + 5. This temporary variable effectively doubles the amount of memory needed to store a (but only for the time necessary to do the operation). This is not necessary if we know that the expression a + 5 will be reassigned to a. In this case, a itself can be used as the temporary variable. Use TEMPORARY to indicate to IDL which variable can be used as the temporary variable:

a = temporary(a) + 5

Now, adding 5 will happen “in place” in the memory that holds a. Incidently, for operations like the above, an even more efficient way to perform this calculation is:

a += 5

There are composite assignment forms like the above for all of IDL’s binary operators.

On the other hand, TEMPORARY does not help for expressions where the variable to be reassigned to is used multiple times like

a = a^3 + a^2

Neither a on the right-hand side of this statement can be marked temporary because marking a variable as temporary makes the variable undefined (until it is reassigned to again). In this case, if the a in a^3 was marked as temporary, then the a in a^2 would be undefined. (And vice versa.)

The argument of TEMPORARY must be a named variable to have an effect. So

a = temporary(a + 5)

will perform its arithmetic correctly, but will create a temporary variable in doing it.