VERSION: FEBRUARY 2005




A GUIDE TO IDL FOR ASTRONOMERS

R. W. O'Connell






I. INTRODUCTION

1. IDL In Context

The Interactive Data Language (IDL) is a proprietary software system distributed by Research Systems, Inc., of Boulder, CO (http://www.rsinc.com), now a division of Kodak. IDL grew out of programs written for analysis of data from NASA missions such as Mariner and the International Ultraviolet Explorer. It is therefore oriented toward use by scientists and engineers in the analysis of one-, two-, or three-dimensional data sets. RSI claims over 150,000 users.

IDL is currently available in UNIX, LINUX, Windows, and Macintosh versions for most of the popular scientific data processing platforms, including Sun, HP, IBM, SGI, PCs, and Macs (see list at http://www.rsinc.com/idl/detail.cfm). IDL device drivers are available for most standard hardware (terminals, image displays, printers) for interactive display of image or graphics data.

The data reduction and display software that most astronomers are familiar with, including IRAF, STSDAS, AIPS, CIAO, MIDAS, and SUPERMONGO, consists primarily of specialized, task-oriented routines not intended for customization or enhancement by the user. These mostly function like "black boxes" and do not provide the user with easily understandable access to their inner workings.

By contrast, IDL is genuinely a computer language, readily understandable by any computer-literate user. It offers all the power, versatility, and programmability of high level languages like FORTRAN and C. But it adds three capabilities that are essential for modern data analysis: interactivity, graphics display, and array-oriented operation.

Users who are conversant with FORTRAN, C, C++, or other high level languages will have little trouble understanding IDL. Its syntax and operation are clear, sensible, and convenient (most similar to FORTRAN's). Because it is interactive, learning IDL through on-line trial-and-error is rapid.

IDL provides the scientist better understanding of and control over computations and data analysis by virtue of a large number of special features:

A functional description of IDL is available at http://www.rsinc.com/idl/detail.cfm.

Six versions of IDL have appeared to date. Version 6.1 is now the standard.


2. IDL Applications Packages

Because of its power, versatility, transportability, and ease of use, IDL has already become the basis of a large, user-written public library of interactive software, now over 4000 programs.

How is IDL distinguished from other software readily available to astronomers?


Many astronomy-oriented IDL routines and packages are in the public domain.

The IDL source code for applications packages is automatically available, so the user can use the routines as written or, alternatively, readily modify their components (as one would FORTRAN subroutines) to customize them. Existing C or FORTRAN programs can also be executed from within IDL.

IDL is ideally suited for software exchange over the Web. Because no pre-compilation is required, installation of a new package, for instance, is simply a matter of putting the ASCII source files in your IDL path.

The built-in journal-keeping and command recall/edit features of IDL are so important to efficient and reliable data analysis that it is something of a mystery why most other astronomical software packages do not offer them.

The intrinsic capabilities of IDL coupled with its extensive user code libraries greatly enhance scientist efficiency.

For examples of IDL computational and graphical applications, run the idldemo demonstration.

Sources of IDL applications code:


3. IDL Limitations

IDL has many virtues, but what are its limitations?

An obvious limitation, and a significant barrier for some people, is that IDL is a proprietary system, which means that each site must purchase an IDL license. Some astronomers object in principle to paying for software.

IDL is an interpreted rather than a compiled language. This means that large IDL programs can execute less rapidly than equivalent compiled programs written in FORTRAN or C.

IDL works best on moderate-sized data sets (say up to 500 MB) and where one does not need to reference individual array elements. Users needing to batch-process large amounts of data with more sophisticated algorithms (e.g. radio interferometry data with CLEAN) may find FORTRAN or C routines to be preferable. However, these can be linked into the IDL interactive environment and executed from within IDL.
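
The cost of referencing individual array elements can be sketched as follows. This is a minimal illustration, not a benchmark: the array size and variable names are arbitrary assumptions. Both forms compute the same result, but the explicit loop is executed element by element by the interpreter, while the whole-array expression is evaluated internally by IDL.

```idl
; Illustrative only: scale and offset an image two ways.
image = findgen(1000, 1000)          ; hypothetical 1000 x 1000 test array

; Slow: an explicit loop that references each element individually
result1 = fltarr(1000, 1000)
for j=0L,999 do for i=0L,999 do result1[i,j] = 2.0*image[i,j] + 1.0

; Fast: a single array expression
result2 = 2.0*image + 1.0

print, max(abs(result1 - result2))   ; compare the two results
```

Whenever an operation can be written as an expression on whole arrays, that form is usually both shorter and faster.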

Because of its optimization for interactive computing, IDL is not the system of choice for large scale numerical simulations. However, it is valuable as a medium to explore new computational approaches in a smaller setting where raw speed is not important, and it is also excellent for visualizing, analyzing, editing, and displaying numerical data sets generated by simulation software.

A problem for novices is that help with IDL may be hard to come by at a new installation. There are, however, consultants available at RSI, Web-based advice sites, and the examples of thousands of working IDL routines in the public libraries that can help solve many software difficulties. IDL's interactive operating environment makes debugging much easier than is typical of data analysis software.

Data reduction packages are not available in IDL for most of the specific instrumentation available at the major observatories (e.g. CCD mosaic imagery, multi-object spectrographs, echelle spectra, etc.).

Finally, the rapidly proliferating set of IDL applications routines is both a strength and a weakness. While one has access to a wide variety of useful software, that software is not always fully tested, since the authors typically apply it to problems of limited scope. The hardest part of using IDL is often determining what routines are available for a given application and deciding which is best to use.

Overall, in exchange for its improved versatility and power, IDL requires a higher level of computer skill than do systems like IRAF, AIPS, or CIAO.

On balance, IDL is an invaluable tool for most observational or theoretical astronomers.



II. GUIDE TO IMAGE PROCESSING WITH IDL


This section provides an introduction to intrinsic IDL and user-supplied applications routines frequently used in 2-D image processing. Only the most common options for each command are listed. Assumed are: UNIX, IDL V5.3 or higher, and a SUN Workstation environment. Most of the listed non-intrinsic procedures are Astronomy Users Library routines.

This guide is oriented toward the UVa/ASTSUN installation of IDL. However, I have tried to clearly distinguish details which are specific to the local system so that others will be able to use the guide.



1. THE IDL ENVIRONMENT, DEMOS, SOFTWARE LIST

The IDL Environment


Demonstrations: To get a feel for what IDL can do, try running the package of standard IDL demos supplied by RSI. Start from the UNIX prompt and type idldemo.


Tutorials: A set of three introductory exercises covering the basic features of IDL is available on the UVa ASTR 511 home page. Other sites offering IDL tutorials are linked to the Astronomy Users Library.


Software List: Five different levels of IDL procedures and functions will be useful to you:

Acknowledgements: In any published work utilizing the Goddard software, the GSFC Astronomy User's Library group and Wayne Landsman should receive an acknowledgement. Modified versions of this software should propagate the authorship list in the header section of each routine.


2. STARTING & STOPPING IDL

  • To customize the IDL environment (e.g. to create special windows for plots or establish main-level common blocks), you can execute special initialization files at the start of each session. IDL will always execute the special "Startup" file defined in the $IDL_STARTUP environment variable. A standard version of this file must be executed before the MOUSSE routines will run properly. This is done by default for UVa ASTSUN users. See the "Setup" section below.
  • To give UNIX system commands from within IDL: enter $ as the first character on the command line.

  • To interrupt and resume IDL: use the standard ^z and fg UNIX commands.

  • To interrupt an IDL routine: type ^c. If in cursor mode, you may have to move the cursor to the active window and press mouse buttons to complete the interrupt. On some commands (e.g. array calculations or I/O) the interrupt may require some time to take effect. To continue the same routine, type .con. To exit the routine after an interrupt and return to the main level, type retall.

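  • To repeat or edit and execute an earlier command: use the up and down arrow keys to step through the command recall buffer, edit the recalled line if desired, and press Return.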

  • To continue a long statement on the following line: end the line with a dollar sign ($). You may do this anywhere in the line where a space would be allowed except within a string variable. If you need to create a long string (e.g. in format statements), you can define pieces of the string on separate lines and then concatenate them. (E.g. stringtot = string1+string2).

  • To give multiple commands on a single line: separate them with the ampersand (&).

    A small set of &-linked commands is a quick way to generate "micro-programs" which can be re-executed with a couple of keystrokes.

  • To leave IDL: type exit or ^d. All data and windows will be flushed. If you want to save data, use the save command or various other file writing commands before exiting.
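
The line-continuation, multiple-command, and save conventions in the list above can be sketched in a short session. All variable names and the file name session.sav are illustrative assumptions.

```idl
; Continue a long statement on the next line with $
x = findgen(100)/10.
y = 3.0*x^2 + $
    2.0*x + 1.0

; Build a long string from pieces, then concatenate them
string1 = 'First part of a long label, '
string2 = 'second part'
stringtot = string1 + string2

; Several commands on one line, linked with &
plot, x, y & oplot, x, y/2. & print, max(y)

; Save selected variables before leaving IDL; restore them in a later session
save, x, y, filename='session.sav'
restore, 'session.sav'
```

Recalling the &-linked line from the command buffer re-executes the whole sequence at once.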




    3. HELP

    Documentation: IDL is thoroughly documented in electronic and printed manuals. RSI issues a full set of manuals in PDF format with each license. If the PDF versions have been installed on your computer system, they can be accessed through the UNIX command idlman. The most important manuals are Using IDL, The IDL HandiGuide/Quick Reference, Building IDL Applications, and The IDL Reference Guide. These are accessible on-line from within an IDL session through a Hyperhelp facility.

    Learn by Example: One of the best ways to learn how to write and use IDL programs is simply to inspect existing IDL programs in the AstUseLib directories. You can page through them with the UNIX more command, or use the AstUseLib getpro routine to copy them to your local directory. To view public routines during an IDL session, type .run -t [routine name].

    Browser Access:

    Informational Commands


    4. PROGRAM EXECUTION

    All intrinsic IDL programs are compiled and ready for execution when you begin your session. Other programs are normally compiled only when you request them. A list of the latter is presented if you type help,/rou.

    NOTE: all IDL procedures/functions are assumed to be in files with the explicit extension '.pro'.
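
    As a minimal sketch of how this works, suppose a file named imstats.pro (a hypothetical name and routine, not part of any library) is placed in a directory on your IDL path with the following contents:

```idl
; Contents of the hypothetical file imstats.pro
pro imstats, image
   ; Print basic statistics for an array
   print, 'Min, max: ', min(image), max(image)
   print, 'Mean:     ', total(image)/n_elements(image)
end
```

    Typing imstats,findgen(10,10) at the IDL prompt should then cause IDL to locate the file, compile it, and run it. After editing the file, .run imstats recompiles it, and help,/rou will include it in the list of compiled routines.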


    5. IMAGE RETRIEVAL

    Data transfer to memory: During an active session, IDL maintains relevant data in random-access memory stored in variables with arbitrary names chosen by the user. The first step in IDL image analysis is normally therefore to read image data files from disk storage into IDL variables in RAM. These variables can be manipulated at will using arithmetic, extraction, compression, expansion, renaming, conversion, and a multitude of other built-in and user-supplied functions.

    Note that this is in contrast to IRAF/STSDAS, where there is no intermediate data storage, all manipulation involves files, and one must refer to images by their file names at all stages in the analysis process.
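
    A minimal sketch of this workflow, assuming a FITS image and the AstUseLib routine readfits, follows. The file name myimage.fits is hypothetical, and the subarray limits assume an image of at least 200 x 200 pixels.

```idl
image = readfits('myimage.fits', header)   ; read data and FITS header into memory
help, image                                ; show the variable type and dimensions
sub = image[100:199, 100:199]              ; extract a 100 x 100 subarray
fsub = float(sub)                          ; convert to floating point
small = rebin(fsub, 50, 50)                ; compress by a factor of 2 on each axis
```

    Each result is simply a new variable in memory; nothing is written to disk until you explicitly request it.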


    6. IMAGE DISPLAY

    General:

    There are two steps in displaying an image on your computer screen.

    The resulting appearance on your computer screen therefore depends both on the array values and the color tables. Most of the standard display commands (e.g. ctvscl) transfer scaled data to the image buffer and automatically display the buffer with previously-defined color tables.

    For a "grey-scale" set of color tables, R(n)=G(n)=B(n). For a color display, the three vectors contain different entries. In either case, there is a direct correspondence between the intensity values in the buffer and the color which appears on the screen. Although it does not add fundamental information, pseudo-color display of a 1-byte buffer can be very useful in exploring different brightness levels in a complex image. True-color displays require the equivalent of three image buffers, each feeding a color gun independently.

    The human eye cannot really distinguish 256 levels of either grey scale or color, and astronomical images often contain much more than a 256:1 intensity range. Therefore, the hardest part of displaying images is selecting those ranges you want to display and making them distinct on the screen. Intrinsic IDL and the user libraries provide a variety of tools for doing this. Once you have made a good visible display, transferring it to hardcopy such that its quality is preserved is yet another challenging step. See Graphics Hardcopies below.
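
    The interplay of array values and color tables can be sketched with intrinsic IDL routines. This sketch assumes that image is a 2-D array already in memory, that a graphics window can be opened, and that the display uses indexed (not decomposed) color. The intensity limits are arbitrary placeholders.

```idl
loadct, 0                          ; load the standard grey-scale table
tvlct, r, g, b, /get               ; copy the current color vectors into r, g, b
print, r[100], g[100], b[100]      ; entries are equal for a grey-scale table

; Map a chosen intensity range onto the 256 display levels
lo = 10.  &  hi = 500.             ; hypothetical range of interest
scaled = bytscl(image, min=lo, max=hi)
tv, scaled                         ; display the byte-scaled array

loadct, 3                          ; switch to a pseudo-color table
tv, scaled                         ; redisplay the same data with the new table
```

    Changing lo and hi and redisplaying is a quick way to explore different brightness ranges in the same image.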

    Color Display Technical Issues:


    Commands:

    Note: The routines chan, cdel, ctv, and ctvscl described below are MOUSSE versions of the intrinsic IDL routines window/wset/wshow, wdelete, tv, and tvscl, respectively. They have several important convenience features (e.g. rescaling of window sizes to the image size, combination of the window set and show functions, etc.) and make use of common blocks which can be accessed by other routines. To use these routines, you must run the Mousse Startup File (see Appendix C).


    The ATV Image Display Tool


    7. IMAGE INSPECTION & MANIPULATION

    Inspection:


    Manipulation:


    8. IMAGE STORAGE


    9. IMAGE PHOTOMETRY


    10. ASCII FILES

    The best way to save small data sets (e.g. photometry output) is in the form of ASCII files, since these are easily edited and transported. Intrinsic IDL supports reading and writing ASCII files; see the IDL manuals. A sample script to write a file containing target names, coordinates, and brightnesses might look like the following:

    
    
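           get_lun,unit                  ; Allocate a free logical unit number
           openw,unit,'OutputFile'       ; Open the output file for writing
           form='(a15,3x,f9.5,3x,f9.5,3x,f6.2)'
                                         ; One line per target: name, RA, Dec, V mag
           for i=0,numtarg-1 do printf,unit,format=form,$
                   targid(i),radeg(i),decdeg(i),vmag(i)
           free_lun,unit                 ; Close the file and release the unit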
    
     

    Handy utilities to read simple ASCII files consisting of separate columns of data are readcol, for free-format entries, and readfmt, for fixed-format entries. Both have the nice feature that they will skip over comment lines (or other lines with non-matching formats) without choking. An example of the use of readcol is given in Plotting Example 3 below. A related utility for printing several vectors to either the screen or an ASCII file is forprint.
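
    A brief sketch of readcol and forprint, assuming a four-column file like the one written above, follows. The file names targets.dat and targets.out are hypothetical.

```idl
; Assumed columns: name, RA (deg), Dec (deg), V magnitude
readcol, 'targets.dat', targid, radeg, decdeg, vmag, format='A,D,D,F'

; Print the numeric vectors to the screen, one row per target
forprint, radeg, decdeg, vmag

; Write the same columns to an ASCII file instead
forprint, radeg, decdeg, vmag, textout='targets.out'
```

    The format keyword tells readcol the data type of each column (A for string, D for double precision, F for floating point); without it, all columns are read as floating point.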


    11. DATABASE ACCESS

    Thanks to the efforts of Don Lindler, Wayne Landsman and others at GSFC, the IDL Astronomy User's Library offers convenient and powerful access to a number of on-line databases. Information from these can be directly incorporated into your IDL image processing or other computational sessions. For instance, the MOUSSE routine tvdbase marks the locations of sources in selected catalogues on your current image display (assuming you have accurate astrometry for your image). Individual commands in the database package are described on the IDL Astronomy User's Library home page. A more detailed manual (in PostScript) is also available.

    The databases must have been put in a special IDL-readable format before you can access them (commands to do this are part of the package). A selection of IDL databases of wider interest is publicly available from the IDL Astronomy User's Library.

    At UVa, approximately 95 such databases, weighted toward UV science, are available. They are presently linked to: /astro8/idl/zdbase.

    In order to use the databases, you must have defined the environment variable ZDBASE to point to the directory containing them.
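
    One way to do this from within an IDL session is sketched below, using the UVa directory quoted above; substitute the location at your own site. The variable can also be set in your UNIX shell startup file before IDL is launched.

```idl
setenv, 'ZDBASE=/astro8/idl/zdbase'   ; define the environment variable for this session
print, getenv('ZDBASE')               ; confirm the setting
```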

    To see what databases are available, use the command dbhelp,1. To see what information is included in a given database and to display information for selected entries (here numbers 10,100,1000), type

       dbopen,'[data base name]'
       dbhelp,1
       dbprint,[10,100,1000],'*'
    
    To retrieve and use the data entries, you will need to use the more sophisticated commands described in the documentation cited above.


    12. PLOTS

    The basic IDL commands for making plots are plot, for creating a new plot, and oplot, for overplotting on an existing plot. Plots can be made to a terminal graphics window or to a variety of external devices.

    This, however, is only the tip of an immense iceberg. IDL contains many options for making plots---so many, in fact, that the hardest part of the job can be keeping track of the multiplicity of optional parameters. Options in the form of keywords can be specified in the calls to the plotting functions or they can be invoked in the form of system variables, such as !p.title, which will apply to all later plot calls until changed.

    The IDL defaults are not as "nice" as those in SUPERMONGO, for example. However, you can quickly customize to obtain as sophisticated a plotting style as you like. All the functionality of SUPERMONGO and other astronomical graphics packages is inherent in IDL. Many of the 2-D and 3-D graphics routines are illustrated in the IDL Demos which come with the system.
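
    The difference between keywords and system variables can be sketched as follows. The data and titles are purely illustrative.

```idl
x = findgen(50)/5.                 ; illustrative data
y = sin(x)

; Keywords apply to this call only
plot, x, y, title='Keyword Title', xtitle='X', ytitle='sin(X)', psym=4

; System variables persist for all later plots until reset
!p.title = 'System Variable Title'
!x.title = 'X'
!y.title = 'sin(X)'
plot, x, y
oplot, x, 0.5*y, linestyle=2       ; overplot a dashed curve

; Restore the defaults
!p.title = ''  &  !x.title = ''  &  !y.title = ''
```

    Keywords are the safer choice for one-off changes, since a forgotten system variable setting will silently affect every later plot in the session.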

    Color tables for plots:


    Sample plotting scripts for displaying plots on your terminal follow. These involve a mixture of intrinsic IDL and AstUseLib routines:

    1. Plot galaxy surface brightness data (in magnitudes) with error bars versus radius to the one-quarter power. Use discrete symbols (not a connected line). On this plot, a galaxy with a standard "de Vaucouleurs" brightness profile will produce a straight line.

      Assume that the vectors flux, rads, and fluxerr already exist, with fluxes in units of ergs/s/cm^2/Angstrom. Assume that you must truncate the plot to eliminate the first 3 entries and those after the 20th.

       
      
      mags=-2.5*alog10(flux(3:19)) -21.1 ; Convert fluxes to magnitudes,
                                         ;    ignoring bad data
                                         ; Assumes all flux entries gt 0 
                                       
      
      r25=rads(3:19)^0.25                ; Compute fourth root of radius vector
      
      magerr=1.086*fluxerr(3:19)/flux(3:19)   ; Convert uncertainties 
                                              ;   in flux to magnitudes
      
      !y.range=[25,18]               ; Set a non-default y-axis range.
                                     ; Note: this magnitude scale has smaller values
                                     ;    higher on the y-axis
      
                                     ; Make the titles
      
      !p.title='Sample Surface Brightness Profile'
      !x.title='Radius(arcsec)^0.25'
      !y.title='Surface Bright (mags/arcsec^2)'
      
      plotsym,4,1.5,/fill            ; Choose filled triangles for plotting symbol,
                                     ;   50% larger than the default
      
      
      ploterror,r25,mags,magerr,psym=8
                                    ; Make plot on terminal screen with error bars; 
                                    ; The keyword psym selects the plotting symbol;
                                    ; Psym=8 specifies a user-created symbol, which
                                    ;    in this case was defined by plotsym
                                    ; Plot will appear in active window, or in 
                                    ;    Window 0 if none have been opened yet.
      
       
    2. Make a contour plot of a smoothed image.
       
      
      chan,3                              ; open plotting window
      
      !x.title='X'                        ; make titles
      !y.title='Y'
      !p.title=' Contours for Image'
      
      square                              ; set aspect ratio to make square plot
                                          ; (Note that you must also give this command
                                          ; after the PostScript device is called
                                          ; when making hardcopies.)
      
      smooth_one=smooth(image,5)          ; smooth the image by a 5 pixel boxcar
      
      clev=[10,20,40,80,160]              ; define trial contour levels--assume
                                          ;     these values span range of interest
      
      contour,smooth_one,levels=clev       ; do fast test of contour plot.  Check &
                                           ;   iterate clev for best appearance
       
      contour,smooth_one,levels=clev,/follow  ; do more accurate (slow) plot,
                                              ;  with labels
      
       
    3. Read, sort, and plot data from an ASCII file; make & display trial polynomial fits

      Assume x is a vector containing values in the range 0 to 5 and that y is the corresponding dependent variable. Assume that these are to be read from an ASCII file named xy.dat, which contains x and y in separate columns. xy.dat can contain an initial explanatory section and other separator headers, as long as none of these contain only one or two floating point numbers (since readcol will mistake those for data lines). The readcol routine will read in the numerical x,y data ignoring (in this example) any line containing alphabetic characters. No labels are put on the plot in this example.

       
      readcol,'xy.dat',xx,yy         ; Read data from the file.  Note that format 
                                     ;    statements are not required
      
      index=sort(xx)                 ; Sort the arrays in order of increasing x 
      x=xx(index)
      y=yy(index)
      
      chan,1                         ; Open Window 1 for plotting
      
      plot,x,y,psym=5                ; Plot the data with open triangles
      
      quadcoeff=poly_fit(x,y,2)      ; Derive coefficients for best quadratic
                                     ;    polynomial fit
      print,quadcoeff                ; Print these out (optional) [Note: 
                                     ;    quadcoeff is a vector]
      
      tx=findgen(101)/20.            ; Create independent variable vector for fitted
                                     ;    values (uniform interval of 0.05 x units)
      
      quadtesty=poly(tx,quadcoeff)   ; Create the fitted quadratic values
      
      oplot,tx,quadtesty,psym=1      ; Overplot the quadratic fit with plus signs
      
      cubcoeff=poly_fit(x,y,3)       ; Derive coefficients for cubic fit
      cubtesty=poly(tx,cubcoeff)     ; Create the fitted cubic values
      
      oplot,tx,cubtesty,psym=3       ; Overplot the cubic values with small dots
      
      delta=y-poly(x,quadcoeff)      ; Compute the difference between y data and
                                     ;    the best quadratic fit 
      
      nix=where(abs(delta) gt 2.,nbad) ; Locate those y values which are more
                                       ;    than 2 units from the best fit;
                                       ;    nbad is the number found
      
      weight=fltarr(n_elements(x))+1.  ; Create a weight vector corresponding to
                                       ;     x with unit entries
      
      if nbad gt 0 then weight(nix)=0.0 ; Give deviant points zero weight
                                        ;    (skipped if none were found)
      
      newquadcoeff=polyfitw(x,y,weight,2) ; Derive improved quadratic coeff.
      
      final=poly(tx,newquadcoeff)         ; Create improved fit values
      
      chan,2                         ; Open Window 2 for final, clean plot.
                                     ; (Window 1 is retained for comparison.)
      
      plot,x,y,psym=5,xrange=[1,3.5] ; Plot the data with open triangles in
                                     ;    Window 2.  
                                     ; Assume interest is limited to only 
                                     ;    part of data x-range.
      
      oplot,tx,final,psym=0          ; Overplot final fit with solid line
       

    13. GRAPHICS HARDCOPIES

    The most common method of obtaining hardcopies or permanent storage of graphics output (plots or images) is to use PostScript files, since these can be printed on most laser printers. PostScript files can be later edited and reformatted, though special (non-IDL) programs are needed. IDL also supports output of GIF, JPEG, TIFF, SRF, and other file formats. GIF and JPEG are standard for Internet Web browsers. GIF or TIFF are recommended for transporting files to local vendors to make photographic prints or slides.

    You should always experiment on the terminal screen with your plot format before dumping to an output file. An easy way to do this is to work out the set of commands you want by plotting to the screen, then type set_plot,'ps' (in the case of PostScript output) and repeat the commands using the command recall buffer.

    For more complex plots, use the journal utility, then edit and re-execute the resulting file (or cut and paste across windows).

    The set_plot command determines which output graphics device you are using. The most common versions of this command are:

     
            set_plot,'x'  :  Send output to X Windows (default) 
            set_plot,'ps' :  Send output to the PostScript file "idl.ps"
    

    The subsequent commands for sending data to the PS file are (mostly) the same as for putting data on your monitor screen, since monitors and PS files are interchangeable output devices for IDL.

    You can always check the properties of the current graphics output device by typing help,/dev. You can change these defaults by using the device command. The hardware/software interfaces are sometimes non-trivial, and you will want to plan for a significant learning curve in doing things which are not "vanilla." Before sending large jobs to printers, vendors, etc., be sure to check the files using UNIX ghostview, xv, or other screen display programs.
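
    A minimal PostScript hardcopy sequence, using only intrinsic IDL commands, might look like the following sketch. The data and the output file name myplot.ps are illustrative assumptions.

```idl
x = findgen(50)/5.                           ; illustrative data
y = sin(x)

set_plot, 'ps'                               ; switch to the PostScript device
device, filename='myplot.ps', /landscape     ; name the output file, choose orientation
plot, x, y, title='Test Plot'                ; same plot command as for the screen
device, /close                               ; close the file so it can be printed
set_plot, 'x'                                ; return to X Windows
```

    Forgetting the device,/close step is a common source of incomplete or unprintable files, and forgetting the final set_plot,'x' sends all later plots to the PostScript device rather than the screen.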


    Here are some graphics output methods for common situations:

    Technical Issues:


    14. IDL HINTS AND ANNOYANCES

    Not many people have experienced fully interactive computing before they start to use IDL. There are tremendous advantages but also many pitfalls for the unwary. The pitfalls will, of course, mostly seem obvious and trivial in retrospect---i.e. after you have learned to avoid them. A number of tips & warnings for IDL beginners are discussed in this section.


    Paths, Procedures, Directories