Atle Borsholm recently posted a clever solution for finding the n-th smallest element in an array on the IDL Data Point. He compares this to a naive solution which simply sorts all the elements and grabs the n-th element:

IDL> tic & x = ordinal_1(a, 123456) & toc
% Time elapsed: 3.0336552 seconds.

His solution performs much better:

IDL> tic & x = ordinal_2(a, 123456) & toc
% Time elapsed: 0.46286297 seconds.

I have a HISTOGRAM-based solution called MG_N_SMALLEST in mglib that can do even better:

IDL> tic & x = mg_n_smallest(a, 123456) & toc
% Time elapsed: 0.18394303 seconds.

Note: MG_N_SMALLEST does not return the n-th smallest element directly, but returns indices to the smallest n elements.

I have a more detailed description of what MG_N_SMALLEST is doing in an older article. I like this routine as a good example of using HISTOGRAM and its REVERSE_INDICES keyword. It is also a nice example of when using a FOR loop in IDL isn’t so bad.

I include even more detail on this routine in the “Performance” chapter of my book.
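For readers who have not used REVERSE_INDICES before, here is a minimal sketch of the general idea behind a HISTOGRAM-based approach. This is not the MG_N_SMALLEST source: the routine name N_SMALLEST_SKETCH and its NBINS keyword are made up for illustration, and it assumes floating-point data that is not constant and 1 <= n <= N_ELEMENTS(data). It bins the data, finds the first bin at which the cumulative count reaches n, and sorts only the elements in the bins up to that point.

```idl
; Hypothetical sketch, not the mglib implementation. Returns indices of the
; n smallest elements of data, ordered from smallest to largest value.
; Assumes non-constant floating-point data and 1 <= n <= n_elements(data).
function n_smallest_sketch, data, n, nbins=nbins
  compile_opt strictarr

  ; number of bins is a tuning knob; 1000 is an arbitrary default
  _nbins = n_elements(nbins) eq 0L ? 1000L : nbins

  ; ri holds, for each bin, the indices of the elements that fell into it
  h = histogram(data, nbins=_nbins, reverse_indices=ri)

  ; first bin at which the running count reaches n
  cum = total(h, /cumulative, /integer)
  last_bin = (where(cum ge n))[0]

  ; indices of every element in bins 0..last_bin (at least n of them)
  candidates = ri[ri[0]:ri[last_bin + 1L] - 1L]

  ; sort only the candidates instead of the whole array
  order = sort(data[candidates])
  return, candidates[order[0L:n - 1L]]
end
```

A call such as `ind = n_smallest_sketch(a, 123456)` would then give the n-th smallest value as `a[ind[123455]]`. The savings come from sorting only the candidate elements, so the benefit depends on how evenly the data spreads across the bins.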

Mike Bostock has created some really great visualizations of sampling, shuffling, sorting, and maze generation algorithms. He ends with a quick discussion of using vision to think, using his NYTimes interactive graphic “Is It Better to Rent or Buy?” as an example:

To fix this, we need to do more than output a single number. We need to show how the underlying system works. The new calculator therefore charts every variable and lets you quickly explore any variable’s effect by adjusting the associated slider.

via FlowingData

My MacBook Pro has three OpenCL devices: a CPU, an integrated GPU, and a discrete GPU. I was interested in the performance I could get with my OpenCL GPULib prototype on the various devices, so I ran the benchmark routine on each of them. CL_BENCHMARK simply computes the gamma function for an array of values; see the results for various array sizes below.

Gamma computation performance on host and various OpenCL devices

There are several interesting points to these results:

  1. the discrete GPU did not have the best performance
  2. the CPU OpenCL device outperformed the host, i.e., the same CPU used directly, for arrays of more than a few million elements

Contact me if you are interested in the GPULib OpenCL prototype (still very rough).

Here are the details on the various OpenCL devices on my laptop:

IDL> cl_report
Platform 0
Name: Apple
Version: OpenCL 1.2 (Apr 25 2014 22:04:25)

  Device 0
  Name: Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
  Global memory size: 17179869184 bytes (16384 MB)
  Double capable: yes
  Available: yes
  Compiler available: yes
  Device version: OpenCL 1.2
  Driver version: 1.1

  Device 1
  Name: Iris Pro
  Global memory size: 1610612736 bytes (1536 MB)
  Double capable: no
  Available: yes
  Compiler available: yes
  Device version: OpenCL 1.2
  Driver version: 1.2(May  5 2014 20:39:23)

  Device 2
  Name: GeForce GT 750M
  Global memory size: 2147483648 bytes (2048 MB)
  Double capable: yes
  Available: yes
  Compiler available: yes
  Device version: OpenCL 1.2
  Driver version: 8.26.21 310.40.35f08

Here are the slides from my talk about GPULib and FastDL today to the Scientific Programming in IDL class at the Exelis VIS office.

Motivated by this idiom from Python, I have been experimenting with using a small helper routine from my library, MG_ANY. MG_ANY flips around the return value and count parameter of WHERE and is used something like the following:

if (mg_any(condition, indices=ind)) then begin
  ; use ind for purpose
endif

This replaces the following standard code where count is checked after the fact (and often neglected by novice programmers):

ind = where(condition, count)
if (count gt 0L) then begin
  ; use ind for purpose
endif

To me, this small change, swapping the return value with the value passed back through a parameter/keyword, makes the code more readable and makes it harder to introduce a bug.
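A helper like this can be only a few lines long. The following is a minimal sketch under my own assumptions, not the MG_ANY source; the name ANY_SKETCH is hypothetical, and the real routine may accept more keywords or behave differently at the edges.

```idl
; Hypothetical sketch of a WHERE wrapper: returns 1B if any element of
; condition is true and passes the matching indices back through INDICES.
function any_sketch, condition, indices=indices
  compile_opt strictarr

  ; WHERE returns the indices and reports the number of matches in count
  indices = where(condition, count)

  ; when nothing matches, indices is -1 and the return value is 0B
  return, count gt 0L
end
```

Because the return value is the test itself, a caller cannot reach `ind` inside the `if` block without having checked that there was at least one match.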

I cleaned up a few visual bugs in the beta and IDLdoc 3.6 is ready for release. Get the new version at the releases page[1] of the GitHub IDLdoc wiki. Features are the same as listed for the beta:

  • Checks for updates when using the VERSION keyword.

  • Added Exelis VIS Doc Center output.

  • Provides links to IDL library routines referenced in rst markup code syntax.

  • HTML rst markup directive to include HTML directly into output (contributed by Phillip Bitzer).

  • Reporting only non-empty, non-comment lines in routines/files now.

  • Improved algorithm for computing cyclomatic complexity and also reporting modified cyclomatic complexity.

  • Updated to MathJax 2.0 and using the complete MathJax distribution for better LaTeX rendering.

  • Listing methods inherited from parent classes.

  • Miscellaneous bug fixes.

UPDATE 6/11/14: I released 3.6.1 adding some missing library routines in the .sav file. Download is on the same page.


  1. You can always get the latest version from the GitHub repo

Thanks to Eric Bellm’s idlmagic, it is now possible to use IDL inside an IPython notebook! For example, here’s the HTML output of the first section of Modern IDL as an IPython notebook (download notebook). Note there are still some mistakes, notably the printing of a string as a long array.

If this is not exciting to you, you need to learn more about IPython notebooks.

Careers after a math degree

Ben Schmidt has made an interactive visualization to explore the careers of college graduates using Sankey diagrams from D3 in JavaScript. I hope I need to make a visualization for the web soon, so that I can play with D3.

Why is funeral service so popular with math majors (though I’m not sure what “Miscellaneous manager, including Funeral Service” really means)?

via FlowingData

I haven’t released IDLdoc for a while and quite a few new features have accumulated[1] (download):

  • Checks for updates when using the VERSION keyword.

  • Added Exelis VIS Doc Center output.

  • Reporting only non-empty, non-comment lines in routines/files now.

  • Improved algorithm for computing cyclomatic complexity and also reporting modified cyclomatic complexity.

  • Updated to MathJax 2.0 and using the complete MathJax distribution for better LaTeX rendering.

  • Listing methods inherited from parent classes.

  • Miscellaneous bug fixes.

  • Provides links to IDL library routines referenced in rst markup code syntax.

  • HTML rst markup directive to include HTML directly into output (contributed by Phillip Bitzer).

If you check it out, please let me know if you have any issues. Formal release should happen in the next week or two.


  1. You can always get the latest version from the GitHub repo

Strava is a popular app for tracking your runs and rides. But the accumulated data that can be explored through the Strava Labs Global Heatmap is amazing! It is fun to look around and explore my area’s use for rides and runs. (But I could have told you that 4th St. was a bike highway for people leaving town.)

Heatmap of bicycle use

Looks like Strava is now selling accumulated data to cities to help plan better bike routes. Portland is in.

via FlowingData
