Apptainer cache location
I was trying to figure out where [Apptainer][apptainer] caches the SIF-format images it creates on the fly from OCI images pulled from a remote registry.
The location appears to be under ~/.apptainer/cache/oci-tmp/, as can be seen if you explicitly convert a Docker image to a SIF file at a known path (here ~/somefile.sif) then search for another file with the same checksum under ~/.apptainer:
```
[someuser@node001 [stanage] ~]$ apptainer run docker://alpine:latest whoami
INFO:    Using cached SIF image
someuser
[someuser@node001 [stanage] ~]$ apptainer pull somefile.sif docker://alpine:latest
```
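The checksum comparison itself might look something like this (a sketch: the exact cache layout under ~/.apptainer can differ between Apptainer versions):

```bash
# Checksum of the explicitly-pulled SIF file
SUM=$(sha256sum ~/somefile.sif | cut -d' ' -f1)

# Look for a cached file under ~/.apptainer with the same checksum
find ~/.apptainer -type f -exec sha256sum {} \; | grep "$SUM"
```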
Apptainer cache validity
I was recently asked whether Apptainer (forked from the Singularity project) will definitely update its cache of Apptainer images created from OCI/Docker images if a tagged image changes on a remote repo. Let’s find out:
Let’s start by creating a very simple image that just writes a message to stdout when executed. After creating the image we tag it so that we have an identifier we can use when pushing it to a remote image registry.
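For illustration, something along these lines would do (the registry and repository names here are placeholders, not the ones used in the post):

```bash
# Build a trivial image that just echoes a message when run
cat > Dockerfile <<'EOF'
FROM alpine:latest
CMD ["echo", "Hello, this is version 1"]
EOF
docker build -t hello-msg .

# Tag it against a remote registry so it can be pushed
docker tag hello-msg registry.example.com/someuser/hello-msg:latest
docker push registry.example.com/someuser/hello-msg:latest
```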
Arch Linux: where have my drivers gone?
I’ve been running Arch Linux on my personal and work laptops for a couple of years and love it: it’s very configurable, cutting edge, (surprisingly) stable and has awesome documentation (useful for all Linux users).
However, there’s one aspect of it that really bugs me: files related to the current kernel and its kernel modules (including device drivers) are aggressively purged as soon as you upgrade the linux kernel package, i.e. the kernel modules built against the running kernel are no longer on disk so cannot be dynamically loaded.
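A sketch of what that looks like in practice (the kernel version and module name below are just examples):

```bash
uname -r             # the *running* kernel, e.g. 6.1.1-arch1-1

# A system upgrade that pulls in a new 'linux' package installs modules for the
# *new* kernel version and removes the directory for the version still running
sudo pacman -Syu

ls /usr/lib/modules/"$(uname -r)"   # now fails: No such file or directory
sudo modprobe uvcvideo              # loading a not-yet-loaded driver (e.g. a webcam) fails too
```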
Book review: Ed Mastery by Michael W Lucas
Why learn Latin in 2018 if you’re not a member of the Catholic Church? Or Greek if you’re not Greek? Well, both are great at putting things in context and highlighting commonality: Latin does this for the Romance languages and Greek does this for much scientific terminology.
And Ed, a 46-47-year-old, positively archaic text editor, does this for parts of the Unix userland that are still very much in common use (specifically sed, awk, grep and vi), as I learned from reading Michael W Lucas’ new book Ed Mastery.
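To give a flavour of that commonality (a toy example of my own, not one from the book): the substitution syntax that ed introduced is the same one you reach for in sed and in vi’s ex commands:

```bash
printf 'hello world\n' > greeting.txt

# ed: apply the substitution, write, quit (driven non-interactively here)
printf '%s\n' 's/world/there/' w q | ed -s greeting.txt

# sed, the *stream* editor, uses the same s/// syntax on a stream
sed 's/there/world/' greeting.txt

# and in vi/vim, ":s/world/there/" does the same on the current line
```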
Running job arrays on HPC using Singularity and a Docker image
I thought I’d share a little success story.
A researcher recently approached me to say he’d been having difficulty getting his bioinformatics workflow (based on this) working on the University of Sheffield’s ShARC HPC cluster and had an urgent need to get it running to meet a pending deadline. He’s primarily using GATK, the Genome Analysis Toolkit.
He appealed for help on the GATK forum, members of which suggested that the issue may be due to the way in which GATK was installed.
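Per the post title, the eventual pattern is a Grid Engine job array in which each task runs GATK from a container created on the fly from a Docker image. A rough sketch of that pattern (the script, sample list, resource request and GATK invocation below are illustrative, not the researcher’s actual workflow):

```bash
#!/bin/bash
# gatk_array.sge -- illustrative Grid Engine job array script
#$ -t 1-10          # ten array tasks, one per sample
#$ -l rmem=8G       # per-task memory request (ShARC-style resource name)
#$ -j y

# Pick the sample for this array task from a plain-text list
SAMPLE=$(sed -n "${SGE_TASK_ID}p" samples.txt)

# Run GATK from a Singularity container converted on the fly from a Docker Hub image
singularity exec docker://broadinstitute/gatk:4.0.0.0 \
    gatk HaplotypeCaller -R ref.fasta -I "${SAMPLE}.bam" -O "${SAMPLE}.vcf"
```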
Running Abaqus 2017 within a Singularity container (with hw-accel graphics)
When I started as a Research Software Engineer at the Uni of Sheffield a year ago I was given a lovely Dell XPS 9550 laptop to work on. The first thing I did was to install Arch Linux on it, which has so far proved to be extremely stable despite the rolling release model and the main Arch repository offering very recent versions of most FOSS packages.
Large engineering apps say no to bleeding-edge Linux distros

However, the one main issue with running Arch at work is that some commercial engineering software supports Linux but not all flavours: there are a fair few commercial packages that are only supported on, and only really work with, RHEL/CentOS/SLES and Ubuntu.
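A hedged sketch of where the post is heading: run the application inside a Singularity container built from a CentOS base image, so a vendor-supported userland is available even on an Arch host (the base image, package list and Abaqus path below are all assumptions on my part):

```
# abaqus-centos.def -- illustrative Singularity definition file
Bootstrap: docker
From: centos:7

%post
    # libraries a typical commercial engineering GUI application expects
    yum -y install libXext libXrender libXtst mesa-libGLU redhat-lsb-core

%environment
    # the Abaqus installation itself would be copied or bind-mounted in separately
    export PATH=/opt/abaqus/Commands:$PATH
```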
When will my Grid Engine job run?
After submitting a job to a Grid Engine (GE) cluster such as ShARC, GE iterates through all job queue instances on all nodes (in a particular order) to see if there are sufficient free resources to start your job. If so (and GE isn’t holding resources back for a Resource Reservation) then your job will start running.
If not, then your job is added to the list of pending jobs. Each pending job has an associated dynamically-calculated priority, which the GE scheduler uses to determine the order in which to consider pending jobs.
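You can inspect those dynamically-calculated priorities for yourself (output columns vary a little between Grid Engine variants and versions):

```bash
# List your pending ('p') jobs along with the components of their scheduling priority
qstat -u "$USER" -s p -pri

# Show the scheduler configuration, including the weights used to combine
# urgency, fair-share and POSIX priority into the final job priority
qconf -ssconf
```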
Submitting jobs to remote Grid Engine clusters using the Paramiko SSH client
HPC clusters running the Grid Engine distributed resource manager (job scheduling) software allow jobs to be submitted from a whitelist of ‘submit hosts’. With the ShARC and Iceberg clusters here at the University of Sheffield the per-cluster lists of permitted submit hosts include all login nodes and all worker nodes; for ease of management and security, unmanaged hosts (e.g. researchers’ workstations) are not added to the list.
If you really want to be able to automate the process of submitting jobs from your own machine then one option is to write a script that logs into the cluster via SSH and then submits a job from there.
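That simpler option can be as small as a one-liner (the hostname and job script path here are illustrative):

```bash
#!/bin/bash
# Submit a job script that already exists on the cluster, from your own machine;
# 'qsub -terse' makes qsub print just the numeric job ID
JOB_ID=$(ssh someuser@sharc.example.ac.uk 'qsub -terse jobs/my_job.sge')
echo "Submitted job ${JOB_ID}"
```

The post then looks at doing the same programmatically using the Paramiko SSH library.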
It hatches…
I’m Will Furnass, a Research Software Engineer in the University of Sheffield’s Research Software Engineering team.
My boss, Mike Croucher, is a big believer in blogs as a means of getting ideas out there and has been prodding me for some time about setting up my own blog to talk about research software, systems administration and teaching, so I thought I’d start 2018 by finally doing just that.
‘Learning Patterns’ - why the name?