Response to Why IDL Sucks
posted Tue 14 Nov 2006 by Michael Galloy under IDL
I saw the discussion on comp.lang.idl-pvwave about Why IDL Sucks. While the author probably just needed to vent his frustrations at not getting IDL to do what he wanted, I think it’s worthwhile to examine the complaints because many programmers new to IDL may share the same misconceptions (think of it as an addendum to my 12 tips for beginning IDL programmers article).
1. Integer-to-float conversions. IDL promotes operands to match the precision of the most precise operand. So 7 / 2 = 3, since both operands are integers. When either operand is a float, you get a float result: 7. / 2 = 7 / 2. = 3.5. This is “the right thing” to do (as evidenced by the fact that every dynamic programming language I know does it this way).
2. The compiler. Welcome to the world of dynamic languages! IDL doesn’t know about many errors until it executes the code because things can change right up to that point. This is a feature (though of course it has a downside and isn’t appropriate for every programming task). This is no different from Python, Ruby, or any other dynamic language. Not sure what you mean by “when IDL actually does manage to catch an error, there’s about a 0.00001 probability of it telling you what line the error is on.” It always tells me. Are you using CATCH or ON_ERROR? I would recommend you use them only when you have to and when you know what you are doing.
3. FOR-loops. “This inadequecy is completely inexplicable.” There are loops in IDL, so you must be talking about the fact that vectorized operations are faster than loops. The same is true of other interpreted languages, most notably Matlab and Python (with its NumPy, Numeric, etc. numerical packages).
4. Recompilation. IDL doesn’t automatically save compiled routines to files. That would be a nice feature. Check out the SAVE procedure to do it manually.
5. Pass by reference…sometimes. Yes, IDL passes named variables by reference. “You can’t expect a programmer to keep track of when he’s passing by reference and when he’s passing by value unless you make explicit use of pointers.” I’m not sure why that would be; if you’re passing a named variable, you’re passing by reference.
6. Column-major array ordering. This is because IDL was designed to handle images. So IDL uses the image-processing convention of [column, row], i.e. [x, y]. “Every other language in the history of mankind is row-major.” Don’t tell Fortran that.
7. Redefinition of structures. You can’t redefine a named structure without doing a .reset (or exiting IDL). That’s the reason you would use a named structure. Anonymous structures can be redefined all you want; named structures enforce consistency.
8. Ugly Histograms. Yes, there are some issues. Not sure which you are talking about here, though.
9. Anti-parallelism. Check out the thread pool. For more control, check out Tech-X’s FastDL.
10. Integer indexing of FOR-loops. Yes, it’s odd that unlike any other named variable in IDL, the index of a FOR loop doesn’t change type as needed. It gets its type from the start value of the FOR loop and it doesn’t change inside the loop. (This probably has to do with some kind of internal optimization to try to make FOR loops a bit faster, but I’m not really sure why this is; it just is.)
11. Data-type comparison inconsistencies. “But for 0.2, the float type has roundoff error.” So does the double.
    IDL> print, 0.2D, format='(F25.20)'
    0.20000000000000001110
    This article has useful information that every IDL (or scientific) programmer needs to know.
12. Logical operators with floats and ints. IDL chose Fortran style over C style for the truth value of integers. You can choose the C style with compile_opt logical_predicate.
13. Non-existent variables still exist. This is IDL’s version of Schrodinger’s cat because naming a variable brings it into existence (as an undefined variable). “Existing” really isn’t a useful property. “Defined” is more useful. Then, your test
    print, n_elements(stupidvariablethatIhaventdefinedyet)
    0
    tells you exactly that stupidvariablethatIhaventdefinedyet isn’t defined.
14. Global variables in FOR-loops. I’m not sure why you can’t do this, but it probably has to do with point 10 above. System variables are just one kind of global variable in IDL. I would try to stay away from them as much as possible. Try heap variables (pointers and objects). Common blocks and system variables have their uses (I suppose), but they are asking for trouble.
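Several of the points in the list (integer division, the type of a FOR-loop index, the truth value of integers, and testing for undefined variables) can be tried out in a few lines. The following is only a sketch: the routine names demo_points, is_true_default, and is_true_c_style are made up for illustration, and the variable never_assigned is deliberately left undefined.

```idl
; Truth value of a number under the default (Fortran-style) rule:
; an integer is true only when its lowest bit is set.
function is_true_default, value
  compile_opt idl2
  if value then return, 1B else return, 0B
end

; Truth value under the C-style rule: any non-zero number is true.
function is_true_c_style, value
  compile_opt idl2, logical_predicate
  if value then return, 1B else return, 0B
end

pro demo_points
  compile_opt idl2

  ; Both operands are integers, so the result is an integer.
  print, 7 / 2                ; prints 3
  ; One float operand promotes the result to float.
  print, 7. / 2               ; prints 3.50000
  help, 7 / 2, 7. / 2         ; HELP reports the type of each expression

  ; The loop index takes its type from the start value of the loop.
  for i = 0B, 2B do help, i   ; i is a BYTE
  for x = 0., 1. do help, x   ; x is a FLOAT

  ; The even integer 2 is false by default and true with logical_predicate.
  print, is_true_default(2), is_true_c_style(2)

  ; An undefined variable has zero elements.
  if n_elements(never_assigned) eq 0 then $
    print, 'never_assigned is undefined'
end
```

Saving this as demo_points.pro and typing demo_points at the IDL prompt should compile all three routines, since the helper functions come before the main procedure in the file.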
So, yes, there are some things to watch out for in IDL. But with a little forewarning, you can get IDL to solve your problems (and probably much faster than constantly recompiling your 100 times longer C program).
January 3rd, 2007 at 4:44 pm
Regarding number 5, it might *appear* that IDL is passing by value when passing something other than a named variable, for example, “plot, findgen(100)*3”, “print, a[10:200]”, etc. But these statements are still passing their data by reference. That is, they are embedded in the context of an IDL_VARIABLE structure just as a named variable would be.
The distinction is that there’s a flag in the IDL_VARIABLE structure that defines whether the data are the result of an expression or a constant definition. This signals to the interpreter, among other things, whether or not to discard the contents when the called routine exits.
This is one reason why you can’t populate a single array element (or range of elements) via READF, for example. Consider the following:
A = FLTARR(10)
READF, LUN, A[0]
You won’t get an error, but you also won’t change the value of A[0].
“A[0]” is an evaluated expression rather than a “pointer” to the first element in the vector. Before READF is called, this expression will be evaluated to a floating point scalar of value 0, a copy of the array element’s value, and will be placed into its own IDL_VARIABLE. IDL will dutifully read a value from file and place it into this expression’s memory. But as soon as READF is finished, the temporary memory is discarded by the interpreter and the value read from file is lost.
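One way around the behavior described in this comment is to read into a named scalar first and copy it into the array afterwards. This is a minimal sketch, not the only approach: the routine name demo_readf_element and the scratch file name are hypothetical, and the file is created only so the example is self-contained.

```idl
pro demo_readf_element
  compile_opt idl2

  ; Write a one-line scratch file so there is something to read.
  file = filepath('readf_demo.txt', /tmp)
  openw, lun, file, /get_lun
  printf, lun, 42.0
  free_lun, lun

  a = fltarr(10)

  openr, lun, file, /get_lun
  tmp = 0.0            ; a named variable, so READF can store the value in it
  readf, lun, tmp
  free_lun, lun

  a[0] = tmp           ; copy the value into the array element after the read
  print, a[0]

  file_delete, file    ; clean up the scratch file
end
```

The extra assignment costs one line but makes it explicit that the array is modified by the caller, not by READF.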
January 8th, 2007 at 11:34 am
0. My biggest beef with IDL is that it doesn’t test for integer overflows. If I had a nickel for every time I’ve looped off the end of an index. I routinely overflow other integers too, such that I almost always declare integers as LL when I make them. A maximum of 32767 isn’t big enough. Furthermore, IDL doesn’t implement any form of arbitrarily sized integers.
1. Matlab returns 1.5, Lisp and Scheme return 3/2. I find the integer division behavior extremely unintuitive, and I dislike it in Python and Ruby every bit as much as IDL. If I want integer division, I’m happy to ask for it explicitly with an appropriate rounding function. Sometimes I divide integers that I didn’t expect to divide when I defined them, like loop indexes, and then I get bit.
5. This really bothers me. I hate it when a function I write inadvertently changes the value of a keyword parameter.
8. The default histogram output is way too ugly to use. You always have to catch the output density array and hand format it. Also, the REVERSE_INDICES return parameter is simply pathological. Could they possibly have come up with a way to return that information that would be even more prone to off-by-one bugs? REVERSE_INDICES is a perfect application for ragged arrays, if IDL supported them.
15. The absence of a native hash function. Entirely inexcusable.
16. Inability to implicitly loop over arrays and structures. I’d like to be able to do a “for_each_element(arrayOfStrings)”.
17. Absence of an implicit “end of array” variable. In Matlab you can do x(5:end), in Ruby you can do x[5..-1], in IDL you need to do x[5:n_elements(x)-1].
18. Poor string support. IDL’s regex support is pretty limp. It is extremely difficult to parse data that isn’t strictly organized by position. Also, no support for multi-line strings. Also, I think it’s starting to be a little embarrassing when languages don’t support Unicode encodings these days, but I certainly wouldn’t hold my breath for it in IDL.
19. As nice as “where” is, it’s a huge pain to test for -1 every time you want to use the results. In many languages you
20. No package support whatsoever, unless you count “prefix the functions in your library with your initials to avoid namespace conflicts”. And the load path has to be defined in the shell environment? What’s up with that?
21. IDL is so far away from unit tests that most people in the community haven’t even heard of them.
22. sqrt(-1.0) throws an error. I don’t think I’d ever complain if IDL went ahead and cast the answer to a complex.
23. Graphics output can be a hassle. For example, it’s pretty much impossible to write code that can easily switch between PS and PNG output.
24. I wouldn’t mind color management being a little more automatic. There’s no reason we should have to load a whole third party library (FSC_Colors) for that stuff.
IDL does have some nice bits. Keyword parameters are awesome, and its map plotting is pretty sophisticated. But it isn’t enough. In my opinion those are the only things it does better than Matlab, and they don’t give it any edge at all on Python (especially if one uses a high performance numerics module). Frankly, even the extensive public libraries for astrophysics and the like wouldn’t justify IDL in my mind, even if it were free.
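The overflow complaint in point 0 is easy to reproduce. The sketch below assumes a routine with no compile_opt statement, so integer constants default to 16 bits; the routine name demo_overflow is invented for the example.

```idl
pro demo_overflow
  ; No compile_opt here, so an unsuffixed integer constant is a 16-bit INT.
  i = 32767
  help, i              ; reports type INT
  i = i + 1            ; INT + INT stays INT and wraps around
  print, i             ; prints -32768

  ; Suffixes select wider types: L is 32-bit, LL is 64-bit.
  j = 32767L
  k = 32767LL
  print, j + 1, k + 1  ; both give 32768
end
```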
January 8th, 2007 at 12:36 pm
You point out some things that I have problems with also, though there are a few things I will defend.
0. Yes, I think there should be at least a math error when you overflow an integer. I don’t think this is something that can be changed without losing backward compatibility though. Note: you can use “compile_opt defint32” to get 32-bit integers by default (yes, you have to do it in each routine that you want it).
1. My understanding is that Matlab didn’t really have other types except for double until recently (not really sure since the last time I used Matlab was more than 15 years ago), so it’s not surprising that it returns 1.5. I like the general rule that the result will be the same type as the highest precision operand.
5. Don’t change the value of a parameter unless it’s an output parameter.
8. I agree that REVERSE_INDICES is a bit sadistic. Ragged arrays would be nice or at the very least returning the info in two arrays instead of slapped together in one.
15. I haven’t missed it.
16. Yes, a looping construct like Python’s “for x in var:” would eliminate some busy work in a lot of loops.
17. Use * for the end of the array. x[5:*] will work.
18. There is limited support for Unicode, but, yes, it’s not as easy as in more modern languages.
19. I agree the special case of -1 for no elements found in WHERE (and several other routines) is a pain. Empty arrays would be very nice. (Of course, it would break just about every piece of code in existence since everyone uses “N_ELEMENTS(var) EQ 0” to determine if a var is defined.)
20. Namespaces would be great and it seems like it could be added without breaking backward compatibility. Existing routines would use something like Java’s anonymous package.
21. Native support for documentation and testing are both widely needed for large application (or library) development. I have created a documentation utility and a unit testing framework, but something built-in would be used more frequently.
22. Also would be nice if sqrt(-1.0) would be converted to complex, but it follows the “result is the same type as the highest-precision operand” rule which seems like a good general rule.
23. In object graphics, this isn’t bad.
24. That would be nice.
I’ve been using Python a lot recently and agree Python has a lot of advantages over IDL. One advantage that IDL has is that an IDL install contains “most everything” you need. In a current Python project, I have to install Python, NumPy (for numerics), Qt/PyQt (for widgets), HDF/PyTables (for HDF 5 support), matplotlib (for 2D plots), and VTK (for 3D plots). Each package has dependencies on the other packages so I must make sure I have the correct versions and not just the latest. The other advantage IDL has is that there is no maze of licenses that you will have to navigate in Python.
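Three of the suggestions in this reply (points 0, 17, and 19) fit in one short routine. This is a sketch under stated assumptions: demo_reply_points is a hypothetical name, and the optional count argument of WHERE is an addition that the reply itself does not mention, shown here as one way to avoid testing the result against -1.

```idl
pro demo_reply_points
  ; Integer constants in this routine default to 32 bits.
  compile_opt defint32

  n = 32767
  print, n + 1         ; prints 32768, since n is a LONG here

  ; An asterisk as the upper bound means "through the last element".
  x = indgen(10)
  print, x[5:*]        ; elements 5 through 9

  ; WHERE returns -1 when nothing matches. The second argument receives
  ; the number of matches, which can be tested directly.
  idx = where(x gt 100, count)
  if count gt 0 then print, x[idx] else print, 'no matches'
end
```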
January 8th, 2007 at 7:01 pm
Hi Michael, great response. I apologize for the whiny post, I get cranky when I work late sometimes. I complain about languages in a way that is pretty unfair: if they do something that isn’t exactly my favorite way, then I hate them for it. My complaint list for Matlab is just as bad as for IDL (Whose idea was it to use parentheses for both function calls _and_ array indexes? And don’t get me started on their plotting system.)
15. A lot of applications get by fine without hash tables, but when you want them you really want them. I have a couple hundred magnetometers I’d like to be able to refer to by name. In Ruby or Python I could make a hash table, magnetometers, and do things like MEA_data = magnetometers[“MEA”]. In IDL the closest idiom would be to make an array of magnetometer structures with a name field, and then do a MEA_data = magnetometers[where(magnetometers.name eq “MEA”)], which is nowhere near as clean. Furthermore, it doesn’t scale at all. It works ok in my case, with a few hundred records, but imagine you have brightness data for 200,000 stars that you need random access to. You need a hash table.
Furthermore, ruby, python, and lisp let you iterate or map over the elements in a hash (and most other data structures).
Matlab has this same shortcoming. IDL and Matlab both seem to think that all data is neatly referenced by integers, but it’s not.
8. Histogram: I think the problem here is that it’s too powerful and useful. I think histogram should probably be re-implemented as a special case of a function like sort_into_bins, or something like that, rather than bolting so many powerful features onto a plotting routine.
17. That’s helpful, thanks. But you still can’t do x[4:(*-3)], while you can do x[4:end-2] or x[4:-2] in other languages. It’s syntactic sugar, maybe, but I’d use it a lot.
19. I’m not sure what the best solution would be because I don’t think any other languages have a where() that’s as powerful or useful as IDL’s.
21. I should have said this sooner: Thanks a TON for working on a unit test framework. It’s sorely needed.
23. I haven’t done any object graphics. Please correct me if I’m wrong, but I’ve gotten the feeling that object graphics are painfully low level. I pretty much just make plots and maps. Sometimes they get a little complicated, but not so much that I want to have to do ticks or axes by hand. That said, objects are fundamentally a better way of doing graphics than the painfully imperative style of non-object graphics in IDL, or “handles” in Matlab.
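The structure-array idiom from point 15 looks roughly like this when written out. It is a sketch only: the routine name, the station names other than MEA, and the values are all invented for illustration. Each lookup scans the whole name field, which is the scaling problem the comment describes.

```idl
pro demo_lookup_by_name
  compile_opt idl2

  ; Hypothetical records: names and values are placeholders.
  magnetometers = replicate({name: '', value: 0.0}, 3)
  magnetometers.name = ['MEA', 'ABC', 'XYZ']
  magnetometers.value = [1.0, 2.0, 3.0]

  ; Linear search over every record for the requested name.
  idx = where(magnetometers.name eq 'MEA', count)
  if count gt 0 then begin
    MEA_data = magnetometers[idx[0]]
    print, MEA_data.name, MEA_data.value
  endif else begin
    print, 'MEA not found'
  endelse
end
```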
Other things:
IDL’s assoc() is a real nice way to get mem-maps. I wish it were a little easier, though. My biggest complaint is that when I write the code to read an assoc array I rarely remember (or even know) the array structure/size that I wrote. I wish that information was stored in a header for the array. I’ve thought many times about writing a wrapper for it like Liam Gumley’s binread and binwrite, but never actually bothered to do it.
IDL also is relatively nice about endian problems, with /swap_if_big_endian pretty much everywhere I’ve wanted it. Many of my friends who use other languages think that binary data is fundamentally incompatible between hardware. They don’t understand it’s just byte ordering.
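For readers who have not used ASSOC, here is a small sketch of the pattern under discussion. The routine name demo_assoc, the scratch file name, and the 4 by 4 frame shape are arbitrary choices for the example. Note that the reading code has to repeat the frame shape, which is exactly the bookkeeping the comment wishes were stored in a header.

```idl
pro demo_assoc
  compile_opt idl2

  ; Write three 4x4 float frames to a scratch file; frame i is filled with i.
  file = filepath('assoc_demo.dat', /tmp)
  openw, lun, file, /get_lun
  for i = 0, 2 do writeu, lun, fltarr(4, 4) + i
  free_lun, lun

  ; Associate the file with a 4x4 float template. The reader must already
  ; know this shape; nothing in the file records it.
  openr, lun, file, /get_lun
  frames = assoc(lun, fltarr(4, 4))
  second = frames[1]       ; reads only the second frame from disk
  free_lun, lun

  print, second[0, 0]      ; the second frame was filled with 1.0

  file_delete, file        ; clean up the scratch file
end
```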
January 8th, 2007 at 9:12 pm
I understand where the complaints are coming from–I have plenty myself. Most of the stuff I post on this site is “scratching my own itch”. But there are plenty of things that really can’t be done well by a third party in a closed source system.
15. I wasn’t sure what you meant by “native hash function”. I thought you meant something like Java’s Object.hashCode() where everything has a hash code built into it. I agree that hash tables are very useful. Try this implementation. I agree that something built in that other routines knew about would be better.
17. Yes, I find that very useful in Python; I know I would use syntax like arr[4:-2] quite a bit.
19. I found Python’s numpy.core.ma.where and masked arrays to be very useful looking. I don’t have a lot of experience in this yet, so I’ll hold off judgement.
21. No problem. I see Rob put his code out there too (docs).
23. I don’t think object graphics are “painfully low level.” They are definitely not the easiest way for doing simple things, but I find them easier than direct graphics for more complicated tasks. Once you get something showing on the screen, you’re 90% there.
June 20th, 2008 at 10:39 am
IDL sucks!
Main – IDL designers screw every standard of language. Why do I not need to put BEGIN but always have to put END in a program? Huh? Any ideas?
Why do these ENDFOR and ENDIF exist at all? Why is a linebreak the end of the operator? Do the designers only use A=B+C type formulas and never heard of anything longer?
2. Who cares about the world of dynamic languages? In Perl you get a warning about typos; in IDL you can use an uninitialized variable and never know about it! Consider this code:
Var=1
Var1=2
C=Var+2
Perl reports that Var1 is used only once and might be a typo. IDL does not.
6. Why the hell should I care why IDL was designed that way? Obey the standard way of ordering. In the meantime, why can I address arrays using both a(1,2) and a[1,2]?
What do you think about the IDL help system? Why so many pages? Why did they use PDF in 6.0, then their own browser in 6.2, then a web approach in 7.0? Why are the keywords for each command scattered here and there? Hey, I do not have time to read 10 pages per operator, I need one document that explains 100% about PLOT, and with damn examples, too. Not 10 pages. This is just sick.
I have so many complaints about IDL. Guys, learn C++ and forget that ugly monster. It should just go bankrupt and get the hell out of the way of doing science and development. It’s the WRONG tool for the job.
June 20th, 2008 at 12:04 pm
@Askar
I agree IDL’s syntax is not graceful (or even consistent). But it does the job nicely.
Linebreak ends a statement unless there is a $ at the end of the line. But there is no ; to put at the end of every statement. Are there more statements or continuation lines in a program?
ENDIF and ENDFOR exist to give more useful warnings to the programmer.
2. More warnings would be useful.
6. “Obey the standard way of ordering.” There is no standard way. IDL’s ordering is quite useful for people doing image processing (which is why it was designed that way). Being able to access arrays with ()’s was a mistake and backward compatibility makes it impossible to completely fix. Use “compile_opt strictarr” and []’s.
I don’t like the help system (at least for certain versions of IDL, and I don’t like that it has changed so many times). But the help content is pretty good. There are 82 keywords to PLOT, and 73 of them are found in several other graphics routines. I think it makes sense to put the documentation for them in a shared place and put a link to it in the help for PLOT.
C++ is not the answer to being able to quickly read data, analyze it, and plot it.
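The two syntax points in this reply, the $ continuation character and compile_opt strictarr, can be seen together in a short function. This is a sketch; the name demo_strictarr is hypothetical.

```idl
function demo_strictarr, a
  ; With strictarr, parentheses are never treated as array subscripts
  ; in this routine, so arrays must be indexed with square brackets.
  compile_opt strictarr

  ; A trailing $ continues the statement onto the next line.
  total_value = a[0, 0] + $
                a[1, 2]
  return, total_value
end
```

Calling print, demo_strictarr(findgen(3, 4)) should print 7.00000: with [column, row] ordering, a[1, 2] is element 1 + 2*3 = 7 of the 3-column array, and a[0, 0] is 0.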
September 4th, 2008 at 9:43 am
Hello Guys…
Honestly, I support the idea that IDL is not a good language, and hopefully, imho, it will die with this scientific generation. Just check the age of people using it: it’s generally the older ones (who, btw, still think Fortran is great)… hardly any computer literate PhD student uses IDL for their independent new development (old code is another issue…) and only those fellows that ‘follow whatever their advisor uses’ are still running this language…
And Michael, about C++ being not the answer for quick data analysis, I partially agree – but only partially, because if you use CERN’s ROOT you have almost everything you need…