Adventures in Signal Processing and Open Science

Category: Software tools

Revisiting GPU Computing

A few years ago (ah well, I guess back in 2011…) I started experimenting with scientific computing on GPUs. The research project I was working on had equipped a couple of quite powerful servers with as many NVIDIA Tesla C1060s as we could cram in there.

Back then, it was a lot of work getting algorithms to run on GPUs. First I would have to install the GPU drivers manually, which required some detective work to find the right configuration files to edit. Then I would have to install the CUDA toolkit manually. Once that worked, I could start writing code. As some may know, I like doing my computing in Python (for example this). Before that, I was an enthusiastic Matlab user, and ArrayFire (back then called Accelereyes) offered a very nice solution – Jacket – that made it very easy to perform computations on the GPU in Matlab. Unfortunately, that solution was discontinued.

This was around the time I was starting to use Python instead. As far as I recall, PyCUDA was more or less the only option at the time for accessing the GPU from Python. This was a bit challenging, as you would have to write your own kernels in CUDA C to be plugged into Python, and developing software for the GPU in CUDA C was far less efficient than Python coding. On top of that, things had to be optimised quite specifically for a particular GPU architecture. With each new generation of GPU, details changed quite drastically, and your existing code would run inefficiently, or not at all, on newer GPUs. This made it too challenging to keep up, so I decided to focus on more efficient code development in Python (more efficient in terms of development time, not execution time) and quietly mothballed my GPU computing.

Fast forward to today. A lot has happened since, and the newest generations of NVIDIA's GPUs make the good old ones I was experimenting with look almost ridiculous. Not least, the explosion of research in, and applications of, deep neural networks has produced several high-quality software libraries for computing on GPUs. Most of these libraries are quite high-level, meaning that you can interface with the GPU and execute various operations on it at a high level of abstraction, including simply calling functions directly from Python.

The emergence of new, high-level tools for GPU computing in Python (among other languages) has convinced me that the time is ripe for giving GPU computing another go. So I went and bought a GeForce GTX 1080 Ti for my office workstation to get back in the game with some newer hardware. The Tesla C1060s from back in the day were GPUs aimed at scientific computing, and especially the later generations of the Tesla line have focused on good double precision floating point performance. The GeForce card here is a gaming card and comparatively weaker in double precision than in single precision, but the newer Tesla cards are much too expensive, so I chose a GeForce card to keep the cost down.

Over the next few weeks I am going to be experimenting with various possibilities for interfacing with the GPU from Python. Luckily, this has become a walk in the park compared to my earlier attempts:

  1. First I installed the NVIDIA GPU driver from this PPA (I am running Ubuntu). It seems to be a quite stable archive that does not resort to installing all sorts of unstable, bleeding-edge packages on your system. For me, it just worked out of the box without any manual editing of configuration files. Wonderful!
  2. Since I use Continuum Analytics’ Anaconda distribution for all my Python needs, it is very convenient that it can also install the CUDA toolkit:
    conda install cudatoolkit
    And this also worked out of the box for me.

So, the first library I will be trying out is Numba: https://devblogs.nvidia.com/seven-things-numba/. Stay tuned for experiments and don’t hesitate to let me know of any great packages/toolboxes/libraries you think I should try.
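As a first taste of how high-level this has become, here is a minimal sketch of an element-wise addition kernel written with Numba's CUDA support. It assumes a working driver/toolkit setup as described above and that numba is installed; the array sizes and launch configuration are just illustrative.

    # Minimal sketch of an element-wise addition kernel using Numba's CUDA support.
    # Assumes a working CUDA driver/toolkit and that numba is installed.
    import numpy as np
    from numba import cuda

    cuda.detect()  # print the CUDA-capable devices Numba can see

    @cuda.jit
    def add_kernel(x, y, out):
        i = cuda.grid(1)  # absolute index of this thread in a 1D grid
        if i < x.size:
            out[i] = x[i] + y[i]

    n = 1000000
    x = np.arange(n, dtype=np.float32)
    y = 2 * x
    out = np.empty_like(x)

    # One thread per element, rounded up to whole blocks.
    threads_per_block = 256
    blocks_per_grid = (n + threads_per_block - 1) // threads_per_block
    add_kernel[blocks_per_grid, threads_per_block](x, y, out)

    print(out[:4])  # expected: [0. 3. 6. 9.]

Note how Numba takes care of transferring the NumPy arrays to the device and back here; that is exactly the kind of convenience that was missing back in the days of hand-written PyCUDA kernels.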


Seriously, where is the source code?

Update: I sent this comment to the program committee of a conference in response to a recent review. It does not matter which conference; it goes for most of the conferences I am familiar with…

Dear program committee of conference X,

How is it that, in this day and age, you are still letting authors submit papers without disclosing their computational scripts? Most modern papers in our field rely heavily on computational methods, and without being able to see the actual implementation it is impossible to assess whether the results are worth anything at all. Without the actual code, “we used CVX” could mean just about anything; for example, the authors might not be solving the optimisation problem they think they are solving. I find it downright frivolous to think that we can still get away with letting scientific research papers be superficial advertisements for the real scholarship, which lives in the computational code but is hidden away so that no one has any chance of assessing the actual substance of the results being advertised. Furthermore, asking us reviewers to spend our time on such papers, seemingly without any thought given to this, borders on being rude.

Magni 1.7.0 Released

A new version of the Magni software package was just released on the 1st of March. The previous release (1.6.0) introduced approximate message passing (AMP) and generalised approximate message passing (GAMP) reconstruction algorithms. This time we are extending the functionality of the GAMP algorithm to include weighted sparse priors. This effectively means that you can model sparse signals with non-identically distributed entries.

As far as I know, this way of modelling sparse signals in GAMP reconstruction is not part of any existing algorithm; it will be described in further detail in an upcoming paper.

This new feature can be found in the magni.cs.reconstruction.gamp module, more specifically in magni.cs.reconstruction.gamp.input_channel.GWS (see the documentation).
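To give an idea of what this enables, below is a small hypothetical sketch: it constructs a sparse signal whose entries are non-identically distributed (each coefficient gets its own probability of being non-zero) and reconstructs it from random measurements with GAMP using the GWS input channel. The way the input channel is configured here (the config.update call and its keys) and the GWS parameter names are illustrative assumptions only; please check the module documentation for the actual interface.

    # Hypothetical sketch of weighted-sparse GAMP reconstruction with Magni.
    # The config keys and GWS parameter names are assumptions for illustration;
    # see the magni.cs.reconstruction.gamp documentation for the real interface.
    import numpy as np
    import magni.cs.reconstruction.gamp as gamp

    n, m = 500, 250
    rng = np.random.RandomState(0)

    # Non-identically distributed sparse signal: entry i is non-zero with its
    # own probability weights[i].
    weights = np.linspace(0.02, 0.30, n).reshape(n, 1)
    alpha = (rng.rand(n, 1) < weights) * rng.randn(n, 1)

    # Random Gaussian measurement matrix and (noiseless) measurements.
    A = rng.randn(m, n) / np.sqrt(m)
    y = A.dot(alpha)

    # Select the weighted sparse (GWS) input channel -- key names assumed.
    gamp.config.update({'input_channel': gamp.input_channel.GWS,
                        'input_channel_parameters': {'weights': weights}})

    alpha_hat = gamp.run(y, A)
    print(np.linalg.norm(alpha - alpha_hat) / np.linalg.norm(alpha))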

If you are not familiar with the Magni package and are interested in compressed sensing and/or atomic force microscopy, we invite you to explore the functionality the package offers. It also contains various iterative thresholding reconstruction algorithms, dictionary and measurement matrices for 1D and 2D compressed sensing, various features for combining this with AFM imaging, and mechanisms for validating function input and storing meta-data to aid reproducibility.

The Magni package was designed and developed with a strong focus on well-tested, -validated and -documented code.

The Magni package is a product of the FastAFM research project.

Download

  • The package can be found on GitHub where we continually release new versions: GitHub – release 1.7.0 here.
  • The package documentation can be read here: Magni documentation
  • The package can be installed from PyPI or from Anaconda.

Magni 1.6.0 released

Our newest version of the Magni software package was just released on the 2nd of November. This particular release contains some new features that we (the team behind the Magni package) hope some of you will find particularly interesting.

The major new features in this release are approximate message passing (AMP) and generalised approximate message passing (GAMP) estimation algorithms for signal reconstruction. These new algorithms can be found in the magni.cs.reconstruction.amp and magni.cs.reconstruction.gamp modules, respectively. Note that the magni.cs sub-package contains algorithms applicable to compressed sensing (CS) and CS-like reconstruction problems in general – and not just atomic force microscopy (AFM).

If you are not familiar with the Magni package and are interested in compressed sensing and/or atomic force microscopy, we invite you to explore the functionality the package offers. It also contains various iterative thresholding reconstruction algorithms, dictionary and measurement matrices for 1D and 2D compressed sensing, various features for combining this with AFM imaging, and mechanisms for validating function input and storing meta-data to aid reproducibility.

The Magni package was designed and developed with a strong focus on well-tested, -validated and -documented code.

The Magni package is a product of the FastAFM research project.

Download

  • The package can be found on GitHub where we continually release new versions: GitHub – release 1.6.0 here.
  • The package documentation can be read here: Magni documentation
  • The package can be installed from PyPI or from Anaconda.

Thoughts about Scholarly HTML

The company science.ai is working on a draft standard (or what I guess they hope will eventually become a standard) called Scholarly HTML. The purpose seems to be to standardise the way scholarly articles are structured as HTML, in order to use that as a more semantic alternative to, for example, PDF, which may look nice but does little to convey the structure of the content (probably rather the contrary).
They present their proposed standard in this document. They also seem to have formed a community group at the World Wide Web Consortium. This is not a new initiative: there was already a previous project called Scholarly HTML, but science.ai seems to be trying to take the idea further from there. Martin Fenner wrote a bit about the background of the original Scholarly HTML.
I read science.ai’s proposal. It seems like a very promising initiative because it would allow scholarly articles across publishers to be understood better by, not least, algorithms for content mining, automated literature search, recommender systems, etc. It would be particularly helpful if all publishers had a common standard for marking up articles, and HTML seems a good choice since you only need a web browser to display it. Another nice feature is that HTML reflows: I tend to read a lot on my mobile phone and tablet, and it really is a pain when the content does not fit the screen. This is often the case with PDF, which does not reflow well in the apps I use for viewing. Here HTML would be much better, not being focused on physical pages like PDF.
I started looking at this proposal because it seemed like a natural direction to look further in from my crude preliminary experiments in Publishing Mathematics in e-books.
After reading the proposal, a few questions arose:

  1. The way the formatting of references is described, it seems that references can be of type “schema:Book” or “schema:ScholarlyArticle”. Does this mean that they do not consider a need to cite anything but books or scholarly articles? I know that some people hold the (IMO very conservative) view that the reference list should only refer to peer-reviewed material, but this is too constrained, and I certainly think it will be relevant to cite websites, data sets, source code, etc. as well. It should all go into the reference list to make it easier to understand what the background material behind a paper is. This calls for a much richer selection of entry types; BibLaTeX’s entry types could, for example, serve as inspiration.
  2. The authors and affiliations section is described here. Author entries are described as having:

    property="schema:author" or property="schema:contributor" and a typeof="sa:ContributorRole"

    I wonder if this way of specifying authors/contributors makes it possible to specify more granular roles, or multiple roles for each author, as for example Open Research Badges do?

  3. Under article structure, they list the following types of sections:

    Sections are expected to be typed using the typeof attribute. The following typeof values are currently understood:

    sa:Funding (which has its specific structure)
    sa:Abstract
    sa:MaterialsAndMethods
    sa:Results
    sa:Conclusion
    sa:Acknowledgements
    sa:ReferenceList

    I think there is a need for more types of sections. I also see articles containing, for example, Introduction, Analysis, and Discussion sections, and I am sure there are more that I have not thought of.

Publishing mathematics in ebooks – part 1

This is the first part of what I hope will be a series of posts on my explorations of how to author maths-heavy writing in ebook format.

I have for quite some time now been annoyed with PDFs on mobile phones and tablets. Although there are some fine PDF viewers available, it usually still takes a lot of annoying scrolling to read a scientific paper on my phone or tablet. On the other hand, I have recently read a few novels as ebooks on my phone and tablet, and this has been an entirely different, enjoyable experience. The main difference is that the text in ebooks is re-flowable, so it adapts easily to the screen size and preferred font size. This makes ebooks seem like a promising alternative to PDF for distributing scientific papers in a more screen-friendly format. There is just one hurdle: mathematics. Read the rest of this entry »

Teaching with the IPython Notebook

I have been teaching introductory Python for modelling and simulation and for scientific computing for a couple of years now. I am still somewhat new to Python myself, having “converted” from Matlab a couple of years ago. I find the open approach of using free and open source software instead of expensive proprietary software very motivating. I was easily talked into it by my colleagues and quickly decided to base my teaching on it as well.
Read the rest of this entry »

Magni: A Python Package for Compressive Sampling and Reconstruction of Atomic Force Microscopy Images

Our new software metapaper Magni: A Python Package for Compressive Sampling and Reconstruction of Atomic Force Microscopy Images has just been published in Journal of Open Research Software. The paper describes our new software package Magni:

Magni is an open source Python package that embraces compressed sensing and Atomic Force Microscopy (AFM) imaging techniques. It provides AFM-specific functionality for undersampling and reconstructing images from AFM equipment and thereby accelerating the acquisition of AFM images. Magni also provides researchers in compressed sensing with a selection of algorithms for reconstructing undersampled general images, and offers a consistent and rigorous way to efficiently evaluate the researchers own developed reconstruction algorithms in terms of phase transitions. The package also serves as a convenient platform for researchers in compressed sensing aiming at obtaining a high degree of reproducibility of their research.

The software itself is on GitHub as well as on Aalborg University’s repository: DOI 10.5278/VBN/MISC/Magni

Go ahead and check it out if you are into compressed sensing or atomic force microscopy. Pull requests welcome if you have ideas.

Compressed Sensing – and more – in Python


Compressed sensing reconstruction algorithms for Python have so far been quite scarce. A new software package improves on this situation. The package PyUnLocBox from the LTS2 lab at EPFL is a convex optimisation toolbox using proximal splitting methods. It can, among other things, be used to solve the regularised version of the LASSO/BPDN optimisation problem used for reconstruction in compressed sensing:

\underset{x}{\mathrm{argmin}} \| Ax - y \|_2 + \tau \| x \|_1

See http://pyunlocbox.readthedocs.org/en/latest/tutorials/compressed_sensing_1.html

Heard through Pierre Vandergheynst.
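For reference, here is a minimal sketch of how the regularised problem above could be set up with PyUnLocBox, loosely following the linked tutorial. Note that, as far as I recall, PyUnLocBox's norm_l2 is the squared l2-norm, so this uses the usual least-squares data term rather than the unsquared norm written above; the parameter names (lambda_, step, rtol, maxit) should be checked against the current documentation.

    # Sketch of the regularised LASSO/BPDN reconstruction with PyUnLocBox,
    # loosely following the linked compressed sensing tutorial. Parameter names
    # are to the best of my knowledge; check the PyUnLocBox documentation.
    import numpy as np
    from pyunlocbox import functions, solvers

    n, m, k = 200, 80, 10
    rng = np.random.RandomState(0)

    # k-sparse ground truth, Gaussian measurement matrix, noiseless measurements.
    x_true = np.zeros(n)
    x_true[rng.choice(n, k, replace=False)] = rng.randn(k)
    A = rng.randn(m, n) / np.sqrt(m)
    y = A.dot(x_true)

    tau = 0.1
    f1 = functions.norm_l1(lambda_=tau)  # tau * ||x||_1
    f2 = functions.norm_l2(y=y, A=A)     # ||Ax - y||_2^2 (squared) data term
    step = 0.5 / np.linalg.norm(A, 2) ** 2
    solver = solvers.forward_backward(step=step)

    ret = solvers.solve([f1, f2], x0=np.zeros(n), solver=solver,
                        rtol=1e-6, maxit=500)
    x_hat = ret['sol']
    print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))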

I have yet to find out if it also solves the constrained version. Update: Pierre Vandergheynst informed me that the package does not yet solve the constrained version of the above optimisation problem, but it is coming:

\underset{x}{\mathrm{argmin}} \quad \| x \|_1 \\ \text{s.t.} \quad \| Ax - y \|_2 < \epsilon

DAT versioned data


I just came across this presentation shared by Karthik Ram on Twitter (see also http://inundata.org/2013/02/28/version-control-for-science/). It describes a project that tries to create a sort of git for data. It seems to be at a very early stage still, but looks very interesting.
