Eureka! Erm… Ureka!

What is Ureka?

Ureka serves as an all-purpose science solution that allows users to test different Python modules and environment configurations separately from their standard shell environment. The software distribution also includes many common science tools, such as IRAF 2.16, SAO DS9, S(ource)Extractor, and Python (astropy, numpy, scipy, etc.). Ureka’s subset of GNU libraries is quite extensive for a relatively small package (~3GB). The bulk of its total size is IRAF, which is an unavoidably huge science package, but Ureka has outdone itself by taming the beast into a self-contained structure (unlike installing it from scratch), so you can spend more time focusing on functional work rather than troubleshooting missing trailing slashes.

Ureka’s core relies heavily on virtualenv and makes creating “variants” of your current installation extremely easy. As a developer, if you require ten different Python installations, this distribution should make your life much easier.

Installation

Refer to this page for more detailed installation information.

The shortest quickstart guide in the history of science software, ever:

$ wget http://ssb.stsci.edu/ureka/1.4/install_ureka_1.4
$ chmod +x install_ureka_1.4
$ ./install_ureka_1.4
...
$ source ~/.bashrc
$ ur_setup

Performance

Compared to other software distributions in the same league, such as Anaconda, Ureka is on par. The added benefit of Ureka’s built-in complement of science tools makes everything a bit more palatable. Would you rather compile IRAF and link it against Anaconda’s libraries all by your lonesome? How about SAO DS9? I seriously doubt it. Ureka has the upper hand in this regard, but there is one area where performance falls unavoidably short: ATLAS (Automatically Tuned Linear Algebra Software).

Anyone I have ever spoken with about ATLAS, even mentioning it in passing, starts out the same way: The person involuntarily rolls their eyes and lets out a huge painful sigh as if they are reliving a recurring nightmare from their childhood. Every single person complains it was one of the most annoying pieces of software to compile, ever.

ATLAS is “automatically tuned” (i.e. it benchmarks itself against your physical system hardware at compile-time) and therefore makes releasing an even marginally functional pre-compiled shared library impossible.

Ureka’s linear algebra performance on Linux is not very impressive, but not all hope is lost! If you are willing to give up an afternoon, manually compiling ATLAS+LAPACK (tuned) and then re-linking SciPy and NumPy via pip works fantastically. I’ve written a script to do this automatically, so when I’ve ironed out all the kinks I’ll post a how-to article.

The Science Software Branch at STScI has gone through a lot of effort to make this distribution easy to use on a day-to-day basis. So even if you’re not receiving the best number crunching capabilities out of the box, you’re already making up lost time by not building and installing the extremely difficult to understand, often poorly documented, and barely maintained scientific tools that everybody relies on.

Note

ATLAS performance degradation is limited to Linux only.
OSX provides VecLib (Apple’s in-house build of ATLAS/BLAS/LAPACK) which Ureka’s linear algebra dependent software is linked against by default. So if you own an iThing you’re OK for the time being.

Potential Caveats

  • Upgrading the Ureka distribution destroys changes made by the user.
  • Wrapping of system-provided utilities could lead to unwanted effects.

I understand the need for some of these hacks from a development standpoint… I am not sure I agree with them being implemented in a production build.

Environment Variables

IRAF’s nightmare of a FORTRAN compiler wrapper is unfortunately here to stay. If you decide to compile code while Ureka is active, library linkage issues may force you to redefine the following variables (e.g., F77=gfortran, F2C=gfortran) before normal compilation works again.

$ which $F77
/opt/Ureka/iraf/unix/hlib/f77.sh

$ which $F2C
/opt/Ureka/iraf/unix/bin.linux/f2c.e
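
One way to do this for the current shell session is sketched below. The function names and the UREKA_F77/UREKA_F2C backup variables are my own invention for illustration and are not part of Ureka; the gfortran default simply follows the suggestion above.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Ureka): point F77/F2C at another
# compiler for this shell session, remembering what Ureka had set.
use_system_fortran() {
    local compiler="${1:-gfortran}"   # assumed default, per the text above
    # Save Ureka's values only on the first call so they can be restored.
    : "${UREKA_F77=${F77:-}}"
    : "${UREKA_F2C=${F2C:-}}"
    export UREKA_F77 UREKA_F2C
    export F77="$compiler" F2C="$compiler"
}

# Put back whatever F77/F2C held before use_system_fortran was called.
restore_ureka_fortran() {
    export F77="$UREKA_F77" F2C="$UREKA_F2C"
}
```

Run use_system_fortran before configuring a build, then restore_ureka_fortran when you want IRAF's wrappers back.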

GCC

The wrapping of GCC can be undesirable if you rely on a tool-chain other than the system’s default.

$ which gcc
/opt/Ureka/python/bin/gcc

The following error is benign but may be confusing to a newcomer:

$ gcc
Undefined symbols for architecture x86_64: "_main", referenced from: 
    implicit entry/start for main executable
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Because gcc returns 1 instead of 0, this behavior may interfere with scripts that check for the existence and/or functionality of GCC.
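
A script that wants to know whether a working compiler is present does not have to rely on the exit status of a bare gcc call. The sketch below looks the command up with `command -v` and then tries to build a trivial program; the function name compiler_works is hypothetical.

```shell
#!/bin/bash
# Hypothetical check: succeed only if "$1" is on PATH and can build a
# trivial C program. It never looks at the status of a bare `gcc`.
compiler_works() {
    local cc="${1:-gcc}" tmpdir status
    command -v "$cc" >/dev/null 2>&1 || return 1
    tmpdir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$tmpdir/probe.c"
    "$cc" -o "$tmpdir/probe" "$tmpdir/probe.c" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return $status
}

if compiler_works gcc; then
    echo "gcc is usable"
else
    echo "gcc is missing or broken"
fi
```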

PKG-CONFIG

If you rely on pkg-config (most GNU software does) consider removing this wrapper script, or be prepared to set -L and -I arguments manually in CFLAGS/LDFLAGS, etc.

$ which pkg-config
/opt/Ureka/python/bin/pkg-config

$ cat `which pkg-config`
#!/bin/sh

case "$*"
in
    *freetype2*|*cfitsio*)
        /usr/bin/pkg-config $*
        ;;
    *)
        exit 126
        ;;
esac
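
Setting the flags by hand can look something like the sketch below. The helper name flags_from_prefix and the example prefix are assumptions for illustration; substitute wherever the library in question is actually installed.

```shell
#!/bin/bash
# Hypothetical helper: append -I/-L flags for a library installed under
# a given prefix, instead of asking the wrapped pkg-config.
flags_from_prefix() {
    local prefix="$1"
    export CFLAGS="${CFLAGS:+$CFLAGS }-I$prefix/include"
    export LDFLAGS="${LDFLAGS:+$LDFLAGS }-L$prefix/lib"
}

flags_from_prefix /usr/local     # example prefix only
echo "CFLAGS=$CFLAGS"
echo "LDFLAGS=$LDFLAGS"
```

Since the wrapper itself forwards to /usr/bin/pkg-config, calling that path directly is another way around it.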

OTOOL (OSX)

If otool doesn’t exist on your Apple iThing, there is a high probability Ureka isn’t the only piece of software no longer able to check library linkage. However, it does appear to be supplied within the distribution.

$ which otool
/opt/Ureka/python/bin/otool

$ diff /usr/bin/otool `which otool`
Binary files /usr/bin/otool and /opt/Ureka/bin/otool differ

INSTALL_NAME_TOOL (OSX)

This Apple iThing system utility for changing RPATHS is included in the distribution as well.

$ which install_name_tool
/opt/Ureka/bin/install_name_tool

$ diff /usr/bin/install_name_tool `which install_name_tool`
Binary files /usr/bin/install_name_tool and /opt/Ureka/install_name_tool differ

Operation not permitted: Sticky and SETGID

  1. User “foo” creates a directory that is group read-write, setgid, and sticky
    [foo@localhost ~]$ mkdir -p /tmp/test_sticky/test_setgid/{sub1,sub2,sub3}
    [foo@localhost ~]$ chmod 3775 /tmp/test_sticky
    
  2. User “foo” then creates a group read-write, setgid, non-sticky directory structure with a few files underneath.
    [foo@localhost ~]$ chmod 2775 /tmp/test_sticky/test_setgid
    [foo@localhost ~]$ find /tmp/test_sticky/test_setgid -type d -exec chmod 2775 "{}" \;
    [foo@localhost ~]$ touch /tmp/test_sticky/test_setgid/{sub1/file1,sub2/file2,sub3/file3}
    
  3. The resulting structure looks like this:
    [foo@localhost ~]$ find /tmp/test_sticky -printf "%#m:%M:%u:%g:%p\n"|sort -n
    0644:-rw-r--r--:foo:shared:/tmp/test_sticky/test_setgid/sub1/file1
    0644:-rw-r--r--:foo:shared:/tmp/test_sticky/test_setgid/sub2/file2
    0644:-rw-r--r--:foo:shared:/tmp/test_sticky/test_setgid/sub3/file3
    02775:drwxrwsr-x:foo:shared:/tmp/test_sticky/test_setgid
    02775:drwxrwsr-x:foo:shared:/tmp/test_sticky/test_setgid/sub1
    02775:drwxrwsr-x:foo:shared:/tmp/test_sticky/test_setgid/sub2
    02775:drwxrwsr-x:foo:shared:/tmp/test_sticky/test_setgid/sub3
    03775:drwxrwsr-t:foo:shared:/tmp/test_sticky
    
  4. User “bar” attempts to remove /tmp/test_sticky/test_setgid:

    [bar@localhost ~]$ rm -rfv /tmp/test_sticky/test_setgid
    removed `/tmp/test_sticky/test_setgid/sub3/file3`
    removed directory: `/tmp/test_sticky/test_setgid/sub3`
    removed `/tmp/test_sticky/test_setgid/sub2/file2`
    removed directory: `/tmp/test_sticky/test_setgid/sub2`
    removed `/tmp/test_sticky/test_setgid/sub1/file1`
    removed directory: `/tmp/test_sticky/test_setgid/sub1`
    rm: cannot remove `/tmp/test_sticky/test_setgid`: Operation not permitted
    

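    The sticky bit set by “foo” on /tmp/test_sticky prevented “bar” from deleting
    /tmp/test_sticky/test_setgid, effectively overriding the group write permission
    on the parent directory. With the sticky bit set, only the owner of an entry
    (or the owner of the directory itself, or root) may remove it.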

  5. Deleting the test_setgid directory as “bar” without the sticky bit enabled
    [ As "foo", drop the permissions back to setgid only: chmod 02775 /tmp/test_sticky ]

    [bar@localhost ~]$ rm -rfv /tmp/test_sticky/test_setgid
    removed directory: `/tmp/test_sticky/test_setgid`
    
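
To see ahead of time whether a directory carries either bit, `test` has the -k (sticky) and -g (setgid) operators. A small sketch, with the function name dir_bits made up for this example:

```shell
#!/bin/bash
# Report which special bits are set on a directory.
# -k tests the sticky bit, -g tests setgid.
dir_bits() {
    local dir="$1" bits=""
    [ -k "$dir" ] && bits="${bits}sticky "
    [ -g "$dir" ] && bits="${bits}setgid "
    echo "${bits:-none}"
}

demo=$(mktemp -d)
chmod 1775 "$demo"      # turn the sticky bit on
dir_bits "$demo"
chmod 0775 "$demo"      # turn the sticky bit off again
dir_bits "$demo"
rmdir "$demo"
```

If the parent reports "sticky", expect the same "Operation not permitted" for anything in it that you do not own.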

Launch Aquamacs from within a shell

Problem

  • Opening Aquamacs from the shell (/Applications/Aquamacs.app/Contents/MacOS/Aquamacs) throws many deprecation warnings.
  • Multiple instances of Aquamacs are appended to the dock when launched, and it is annoying.

Solution

  1. Open Aquamacs.app
  2. Click “Tools”
  3. Click “Install Command Line Tools”

Resolution

The path to Aquamacs after installing the command line tool package is /usr/bin/aquamacs

Building CATAPACK on OS X

What?

CATAPACK is a package for management and manipulation of photometric catalogues of stellar fields, particularly suitable for the determination of accurate astrometric solutions [...] (ref)

Download

I could not find a copy of CATAPACK on the internet; however, the source code is GPLv2, so I have decided to provide the software on this site.

Link: CataPack-2.1.19.tar.gz

Update (06/19/2014)

CataPack found itself a maintainer and a minor revision bump as well…

Main site: CataPack
Link: CataPack-2.2.4.tar.gz

Building

Prerequisites

Dependencies

Compiling

Install dependencies

sudo port install gsl gtk-engines wcslib wcstools

Unpack the source

tar xf CataPack-2.1.19.tar.gz
cd CataPack-2.1.19

Remove the usage of ancient RSML entries

/usr/bin/sed -i '' \
-e 's|\.so|#.so|' \
src/CataCal \
src/CataCal.in \
src/gsc2 \
src/gsc2.in

Build the source

./configure --prefix=/usr/local
make

Install CATAPACK

sudo make install

Remote filesystems and mlocate

Have you ever worked in a clustered environment that provided remote home directories? Furthermore, have you ever been annoyed by the fact that mlocate does not descend into these remote file systems by default? Me too. I’ve written two small Bash scripts to address this problem.

updatedb_extern

#!/bin/bash

# Edit DEST to point to a local directory you have write access to
DEST=/path/to/local/database/directory

#Don't edit below this line (no point)
EXTERN=( "$@" )
DATABASES=()

#Did we receive any paths to process?
if [ ${#EXTERN[@]} -eq 0 ] ; then
	echo "No path(s) specified."
	exit 1
fi

#Simple adaptation logic to ensure we will have a writable data area
if [ ! -d "$DEST" ] ; then
	if ! mkdir -p "$DEST" 2>/dev/null ; then
		DEST=/var/tmp/mlocate
		if ! mkdir -p "$DEST" 2>/dev/null ; then
			echo "No suitable path to store locate database."
			#Use a fixed status: $? here would be echo's (always 0)
			exit 1
		fi
	fi
fi

#Generate database names based on external path
#Example: /home/myuser becomes _home_myuser.db
for path in "${EXTERN[@]}"
do
	database=$(echo "$path" | sed -e "s|/|_|g")
	DATABASES+=( "$DEST/${database}.db" )
done

#For each external path generate an mlocate database
MAX=${#EXTERN[@]}
for (( i=0; i<MAX; i++ ))
do
	path=${EXTERN[$i]}
	dbpath=${DATABASES[$i]}
	#Progress shown is the percentage reached once this path is done
	echo "[$(( (i + 1) * 100 / MAX ))%] $path -> $dbpath"
	updatedb -l 0 -o "$dbpath" -U "$path"
done
echo
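
To check which database name a given path will map to before running the script, the naming rule can be pulled out into a function of its own. This is only a sketch; the name db_name_for is mine and is not part of the script above.

```shell
#!/bin/bash
# Turn a path into a database file name by replacing every "/" with "_",
# mirroring the sed expression used in updatedb_extern.
db_name_for() {
    echo "$(printf '%s' "$1" | tr '/' '_').db"
}

db_name_for /home/myuser    # prints _home_myuser.db
```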

The developers of mlocate were nice enough to implement the environment variable, LOCATE_PATH, to extend the database search path.  We’re going to use it to our advantage with the following script.

updatedb_extern_setup

#!/bin/bash

DEST=$1
if [ -z "$DEST" ] ; then
	echo "No database path specified."
	exit 1
fi

for db in "$DEST"/*.db
do
	DELIM=":"
	if [ -z "$LOCATE_PATH" ] ; then
		DELIM=
	fi

	LOCATE_PATH=${LOCATE_PATH}${DELIM}${db}
done
echo "$LOCATE_PATH"
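
One thing to watch for: if the database directory is empty, the glob in the loop above stays as a literal *.db and ends up in LOCATE_PATH. A variant that skips non-existent entries is sketched here; build_locate_path is a hypothetical name, and like the script it appends to any LOCATE_PATH already set.

```shell
#!/bin/bash
# Hypothetical variant: join every existing *.db file in a directory
# with ":" and append the result to any LOCATE_PATH already set.
build_locate_path() {
    local dir="$1" db result="${LOCATE_PATH:-}"
    for db in "$dir"/*.db; do
        [ -e "$db" ] || continue   # skip the literal pattern when nothing matches
        result="${result:+$result:}$db"
    done
    echo "$result"
}
```

It can stand in for the loop in updatedb_extern_setup, e.g. build_locate_path /path/to/local/database/directory.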

 

Example Usage

In your ~/.bashrc or ~/.bash_profile (or ~/.profile):

export LOCATE_PATH=`updatedb_extern_setup /path/to/local/database/directory`

In your crontab (e.g. crontab -e):

0 2 * * * updatedb_extern /home/username /some/remote/path

Manually:

updatedb_extern /home/username /some/remote/path