
Wednesday, November 7, 2012

Calling IDL, GDL, and FL from Python


Thanks to Anthony Smith's pIDLy module, accessing IDL and its clones GDL and FL from within Python becomes really easy.

Launch an IPython session (e.g. ipython --pylab, so that NumPy is available as np), then type these example lines to see pidly in action:

I1 import pidly
I2 idl = pidly.IDL()
I3 fl = pidly.IDL('fl', idl_prompt='FL> ')
I4 gdl = pidly.IDL('gdl', idl_prompt='GDL> ')
I5 idl.findgen(10)
O5 array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)
I6 fl.findgen(10)
O6 array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)
I7 gdl.findgen(10)
O7 array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)
I8 np.arange(10, dtype='float32')
O8 array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)

These examples were tested with:

I2 pidly.__version__
O2 '0.2.4+'

I3 np.__version__
O3 '1.8.0.dev-82c0bb8'

$ idl
IDL Version 8.1 (linux x86_64 m64). (c) 2011, ITT Visual Information Solutions

$ gdl    # the version that ships with Fedora 16
GDL - GNU Data Language, Version 0.9.2

$ fl      # latest development snapshot (10/Oct/2012)
Fawlty Language 0.79.18 (linux amd64 m64) Copyright (c) 2000-2012


Example calls to IDL procedures:

I5 idl.pro('cgplot', np.sin(np.linspace(0, 6*np.pi, 1000)))

I6 idl.pro('iplot', np.sin(np.linspace(0, 6*np.pi, 1000)))
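
Beyond procedures, pidly can also execute raw IDL commands, pull expressions back into Python with idl.ev, call functions with idl.func, and shut the child process down with idl.close. A short sketch following pIDLy's own usage guide (the exact output representation may vary):

I7 idl('x = total([1, 1], /int)')
I8 idl.ev('x')
O8 2
I9 y = idl.func('reform', range(4), 2, 2)
I10 idl.close()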

For more information, just read the quick usage guide on pIDLy's website, or clone the source from https://github.com/anthonyjsmith/pIDLy

Monday, December 26, 2011

NumPy 1.5 Beginner's Guide review

This new NumPy book provides an easy start to NumPy for those who want to crunch numbers in the Python programming language, whether you have little programming background or are switching from your favorite language. The author uses quite a friendly tone throughout the book. Most of the important aspects of NumPy are covered well, with clearly explained examples. The examples are explained step by step, starting from basic array/matrix creation and moving to more complex tasks like signal analysis and linear-algebra calculations.

To get the best out of this book, it is recommended that you try out the examples and challenge yourself with the exercises given in the "have a go" sections. If you are feeling lazy for some reason, you can get the source code from the book's page. I finished perusing the book in about a week (having a few years of NumPy experience definitely played a role in this). For me, reading through the stock-market examples was a bit boring, but if you are into the financial business you might enjoy putting NumPy into your tool stack with the help of this book. Notably, besides the basic NumPy beginner chapters, the book extends into basic Matplotlib and SciPy lands. A lot of examples use Matplotlib to create plots and better illustrate the operations at hand (e.g. curve fitting, statistical distributions). It is a bit surprising that the MaskedArray module didn't get any mention in the book. However, with the basics you gain here, it should be fairly easy to start experimenting with NumPy's masked-array functions.

Overall, if you are looking for a book to get started with NumPy in about 200 pages, you might give this one a chance. It is available in both print and electronic formats. Further on, you can try their advanced Matplotlib and Sage books if you are willing to enhance your scientific Python skills.

Final note: I received a review copy from the publisher. Thanks to Packt Publishing for their contributions to open-source literature and for providing me with a free copy of the book.

Friday, October 29, 2010

Seminar on clouds


Yesterday, I gave a seminar talk to our department's faculty and students, titled "An Aerosol - Cloud Condensation Nuclei and Cloud Droplet Concentration Closure Study", summarizing the majority of my thesis work on aerosol and cloud micro-physics research, starting and finishing with the presentation slides shown above and at the very end of this post.

One of the most exciting parts of the talk came after I finished demonstrating my research results and showed a few of the technicalities behind the research. First of all, thanks go to Dr. Jefferson R. Snider for providing his IDL-based parcel model code and for being patient with me while I was re-writing the model in Python. Both versions of the code can be accessed through ccnworks by browsing the source tab and looking under the thesis folder, or alternatively by contacting Dr. Snider directly to get the latest version of the model.

When I was working on the model conversion, at one point I was stuck trying to get similar outputs from the two languages using the same initial conditions and almost identically progressing code. (Some might object to the definitions of "the same" and "almost identical" in the land of high-precision floating-point arithmetic.) While chasing this numerical instability problem, I created a simple animation to better demonstrate the issue I was experiencing. As some eyes might easily catch, the droplets in the Python parcel model reach much bigger sizes than the results produced by IDL, solely based on condensational growth theory. It took a while to find and correct the variable that was updating its original reference when it was supposed to be operating on a copy of the contents instead. This issue and a few other discrepancies were fixed around May 2010. An important personal to-do at this point is to restructure the parcel model code so that it behaves as an external library, making all the important thermodynamics and cloud micro-physics functions greatly re-usable. That is when I will figure out how to pass dictionaries (or change the structure of the code) beyond the scope of the original file in which they are declared.
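
For the record, here is a minimal, self-contained sketch of that class of aliasing bug (the numbers are made up; this is not the parcel-model code): in NumPy, plain assignment hands back a reference to the same array, so an in-place update silently rewrites the original values.

import numpy as np

radii = np.linspace(0.05, 1.0, 5)   # droplet radii, illustrative values

work = radii            # a reference to the same array, NOT a copy
work *= 1.1             # meant as a scratch calculation...
print(radii[0])         # ...but the original radii grew too: 0.055

radii = np.linspace(0.05, 1.0, 5)
work = radii.copy()     # the fix: operate on an explicit copy
work *= 1.1
print(radii[0])         # original value preserved: 0.05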

The final technicality demonstration comes from the work of Dobashi et al. (2008) (their work titled "Feedback Control of Cumuliform Cloud Formation based on Computational Fluid Dynamics"). Theirs are by far the most realistic cloud appearances I have ever seen in a simulation study that applies physical cloud-formation processes. Although cloud modellers are usually interested in studying the temporal and spatial statistical properties of clouds rather than synthesizing their realistic shapes and appearances, similar work would have great educational use in visualising, and thus better demonstrating, the physical processes governing cloud formation and further precipitation development. I suspect the authors had to prototype their simulations in a C-like language to get low-level access to the CPU/GPU because of the high computational demands, but there are fast computation techniques available in Python that would make such an undertaking realizable, be it simulating a research flight or playing the in-between sequences of a simulation to probe important properties of the formation.

For the curious, I share a high-quality PDF version of the slides. Note that I had to take a couple of pages out of the original presentation in order to fit the allocated 45-minute presentation time, and it was unfortunate that one of the expected audience members was not in the classroom to see a few minute, research-unrelated points. All analyses and plots in my slides were performed/created using Python and its scientific tool-stack, with some additional annotation help from OpenOffice on the slides using the DejaVu Sans font; for the first time with full consistency. There is one exception to the pure Python code: the Python-based model calculations were taking five to ten times longer than the original IDL-based code, and that is exactly where Cython magic was sought and applied to boost the functions executed millions of times. Here, I greet the community once more for making scientific computation freely, publicly, and easily available to everyone, without restrictions at both the code and verbal-discussion levels, as this core work should really be in its very essence within the science endeavor.
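
For a flavor of what that Cython step looks like, here is a hypothetical sketch (the function name, signature, and constants are illustrative, not the thesis code) of typing a hot, million-times-called growth update; dr/dt = G(S - 1)/r is the standard condensational growth form:

# growth.pyx -- compile with Cython; a sketch, not the original model code
cimport cython

@cython.cdivision(True)
def grow(double r, double S, double G, double dt):
    # advance droplet radius r by one time step dt
    return r + G * (S - 1.0) / r * dt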

I also salute Matthew Turk and his work for that fantastic presentation style, even though I couldn't copy his great architectural design throughout my slides. Besides him, I have other masters around me to thank for letting me embrace Tufte's sayings :)

My work on clouds is on-going and leading me to PhD-level research after the successful completion of my Masters thesis within the next month. Feel free to say hi or ask more about clouds if you share any research similarities.

Now, it is time for some abstraction: Youth Uprising - Rarefaction

Please keep the flames of imagination and creativity burning all along...

Friday, April 16, 2010

Celebrating the first year of IPython logging

It was a year ago today that I first started logging my IPython sessions explicitly, using Pierre Raybaut's idea. All you need to do is make the changes/additions described in this piece of documentation (Logging to a file) on its soon-to-change LP repository. (Applies to IPython 0.10 and below.)

You get a time-stamped log file (in ~/ipython/, or wherever your IPython home directory is set), created per day, that looks like the one below:

#!/usr/bin/env python
# 2009-04-16.py
# IPython automatic logging file
# 13:15
# =================================
d = loadtxt(file, skiprows=30)
plot([d[i][8:] for i in range(12)])
# =================================
# 14:08
# =================================
boxplot(d[:][8:])

As of writing this entry I count almost 300 separate logs; combining them into one file using this little script yields about 37.5 k-lines (including lots of duplicate entries, time-stamps, empty comments, and much copy-pasted code that I haven't actually typed in).

Besides having this combined file as a rough measure for myself, there is another good use for it, triggered by this question: How to exit IPython properly? IPython's internal history file forgets what was in the session if you accidentally or intentionally kill your IPython session without issuing an Exit at the exit :) That new combined history file comes to our help.

First, we append all the time-stamped logs into one file (renaming it to "history" so that IPython can load it at start-up). Then, in iplib.py, comment out the readline.set_history_length(1000) line to lift the 1000-line limit on your history file. Now I can access all my previous coding history from within IPython again, no matter how I end my sessions (provided that I stitch my logs together periodically).
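
The little script linked above is not reproduced here, but a minimal equivalent (assuming the per-day logs live under ~/ipython/, as above) could look like this:

#!/usr/bin/env python
# Stitch the per-day IPython logs into a single "history" file, oldest first.
# A minimal sketch, not the original linked script.
import glob
import os

ipdir = os.path.expanduser('~/ipython')  # adjust to your IPython home directory
logs = sorted(glob.glob(os.path.join(ipdir, '20??-??-??.py')))
out = open(os.path.join(ipdir, 'history'), 'w')
for name in logs:
    out.write(open(name).read())
out.close()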

Lazy coding at its best!

It would be great if IPython could handle history lines more smartly and read multi-line blocks back properly. Who knows, maybe an IPython super-user has a solution for that laziness as well.

By the way, does anyone know how to remove duplicate lines from a file without actually sorting it?
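
For what it's worth, here is one order-preserving way in Python (a minimal sketch; it keeps the set of seen lines in memory):

# Drop duplicate lines while preserving the original order.
seen = set()
dst = open('history.uniq', 'w')
for line in open('history'):
    if line not in seen:
        seen.add(line)
        dst.write(line)
dst.close()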

Monday, April 5, 2010

GVIM+IPython with Conque plug-in

Here is an alternative way to bridge (G)VIM and IPython. GVIM is my editor of choice and IPython is a great interactive Python interpreter. If you use both environments, this simple integration technique could greatly boost your programming/prototyping speed.

First, go to the conque page (thanks to Nico Raffo for the plug-in and for helping me include the IPython functionality) and install the plug-in following the simple installation instructions. (On my Fedora 12 system, I pulled the latest tar.gz package from the download list section and extracted it under the ~/.vim folder.) Next, grab the latest conque_term_pylab.vim file from the same section and place it under the ~/.vim/plugin directory.

When you open a Python script you can easily launch an "ipython -pylab" instance by hitting F6, execute the whole script with F8, and send a visual selection with F9. The IPython inside the buffer acts as a part of GVIM, so you can easily switch between buffers and copy/move text from/to them. You can modify the IPython switch, the default window position, and the mapped keys by editing conque_term_pylab.vim.

The screenshot below shows the bridge in action:


Hint: Add the following line into your vimrc file to get equally spaced buffer windows independent of the main GVIM window size.

autocmd VimResized * wincmd =

Currently only IPython's %run magic is implemented. Here are some ideas that could improve this IPython + VIM integration:
  • Add a %whos key-mapping
  • Launch "ipython -pylab" automatically on start-up if a Python script is opened.
  • If a "run" command is sent before an IPython instance is launched, instantiate one automatically.
  • Make sure only one instance of IPython is running!
  • GUI integration: create GVIM menus, read %whos values back into a separate window, and change values.
These last points could well be achieved by working on the PIDA project. It is very possible to have a VIM + IPython powered IDE that is especially suited to scientific use.

Tuesday, March 16, 2010

Where you can't code...



Please don't get the feeling that a research aircraft is one of the places where you can't code comfortably :) This blog post has more to do with philosophical investigations of the title.

This is one of the times when my programming literacy is of no avail for the research task ahead of me.

The problem is simply defined: Find the time-ranges where the research aircraft was sampling at cloud-bases.

What about the data-set?
About 37 hours of airborne data from the Saudi Spring 09 atmospheric measurement campaign. It contains aerosol and cloud micro-physical data and atmospheric state parameters from at least 10 different probes, listing over 100 measured variables (e.g. pressure altitude, cloud condensation nuclei concentration, 2D images of ice-cloud particles). We also recorded some of the flights from take-off to landing, partly for nostalgia and mainly to aid us in post-flight studies.

I got these ideas half-way through my manual exploration of the data-set. Many measurements are helpful in this analysis (e.g. air and dew-point temperatures to determine where clouds are forming, the state of the aircraft in a given time-series plot, the amount of liquid water content to distinguish in-cloud from out-of-cloud conditions, and, most valuably in my opinion, visual observation from the recorded videos), helping me infer when the aircraft was actually sampling right underneath a cloud in a level path.

This is part of my job and at the core of my thesis work. Even though I complain a little about the situation, I get paid for what I am doing right now. My complaint is mostly that, in spite of all that rich data-set, I am the one eventually making the final informative decision after manually and cautiously going through the data at hand. Far from being generic or universal. Good luck to myself if I need to extend the analysis to another airborne data-set :) I wish I had taken much wiser notes instead of trying to spot the most interesting cloud of the day.

To complicate the analysis a bit further: not only find the cloud-base measurements, but also find the consecutive vertical passes of that same cloud in an automated fashion. No, not the one on its left, nor the one on the right.

I am counting down the months to my graduation. I probably won't have much time to see a breakthrough in AI research before then, as discussed in this article: How Long Till Human-Level AI?

Monday, March 15, 2010

Simple animation of UWyo CCN counter chamber

After modifying my Random particles in a cube example with Mayavi's @animate decorator, I get this simple animation, which I use to demonstrate particle growth inside the device chamber.
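
The actual code is linked at the end of this post; a stripped-down sketch of the same idea (the particle count and constants here are illustrative) looks like this:

import numpy as np
from mayavi import mlab  # on 2010-era installs: from enthought.mayavi import mlab

# 50 random particles in a unit cube, with gamma-distributed initial sizes
x, y, z = np.random.random((3, 50))
sizes = np.random.gamma(2.0, 0.02, 50)
pts = mlab.points3d(x, y, z, sizes, scale_factor=1.0)

@mlab.animate(delay=100)
def grow():
    while True:
        # grow every particle a little on each frame
        pts.mlab_source.scalars = pts.mlab_source.scalars * 1.02
        yield

grow()
mlab.show()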

The Chemistry department of our school is hosting a two-day Air Pollution Workshop for neighbourhood high-school students. As part of the "Aerosol Particles" lab, I am giving a demonstration about "Cloud Condensation Nuclei (CCN)" and the University of Wyoming designed CCN counter.

The example is really as simple as it gets, since it is designed to help high-school students better appreciate particle growth in a sealed chamber. There are many improvements that could be made to bring the animation much closer to reality, such as:
  • Particle locations should be depicted in a cylindrical space instead of a cube, as given in this example.
  • The animation could be run in parallel with the counter by simply reading the status of the device through its serial port (i.e., reading the TWAIT, FLUSH, and CCN_DETC statuses; see the sketch after this list).
  • A GUI could be added to control the parameters of the instrument (e.g., setting its supersaturation).
  • A hot top-plate and a cold bottom-plate could be drawn on the newly created cylindrical chamber, and the supersaturation could be modelled and shown.
  • Different continuous distributions could be added. (For now the particle sizes are gamma distributed, using NumPy's gamma distribution function.)
  • Instead of growing all the particles, activation could be modeled more realistically (e.g. simply via κ-Köhler theory) together with the condensational growth equation.
  • Particle interactions could be included (e.g. gravitational settling, collisions between particles, etc.)
  • Particle detection could be modelled employing Mie scattering theory.
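
As a sketch of the serial-port idea above: the port name, baud rate, and status keywords below are pure assumptions (the instrument's actual protocol lives in its documentation), but the pyserial skeleton would look roughly like this:

import serial  # pyserial

# port and baud rate are assumptions; check the instrument's manual
ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)
try:
    for _ in range(100):  # poll for a while
        line = ser.readline().decode('ascii', 'ignore').strip()
        # hypothetical status keywords, per the list item above
        if 'CCN_DETC' in line:
            print('counter is detecting; animate droplet growth')
        elif 'TWAIT' in line or 'FLUSH' in line:
            print('counter is idle; pause the animation')
finally:
    ser.close()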
This list could be expanded with many other considerations. Some are easy to implement; some would really take a while to achieve. Before I move further, it is a good idea to read Dr. Snider's Supersaturation in the Wyoming CCN Instrument article while I have the chance to spend time with the counter. In case you are curious about the code, just follow the ccnanim.py link. Yes, all this buzz has been 33 lines of code. That's what you get when you start coding with Python.

*
The bottom-left photo is courtesy of Dr. David Delene (the photo was most likely taken during one of the Mali precipitation enhancement campaigns). The bottom-right one is courtesy of Dr. Manfred Wendisch, from his Airborne Physical Measurements: Methods and Instruments lecture notes.

Thursday, March 11, 2010

Py4Science @ University of North Dakota


On my second attempt to popularize Python and its scientific computing ecosystem, I volunteered to introduce interested people to it in an interactive tutorial session as part of the University of North Dakota 2010 Scholarly Forum program. The Graduate School kindly helped and supported me with this one-of-a-kind interactive presentation in the Forum's history.

The session was advertised on the Scholarly Forum web-page, and e-mails were circulated throughout the campus spreading the session information. The first half-hour of the tutorial was spent installing and demonstrating PythonXY, the Enthought Python Distribution [EPD], and Sage via the notebook interface. After introducing the Python language, we briefly went over the most fundamental members of the Py4Science family: IPython, NumPy, SciPy, Matplotlib, SymPy, and the Enthought Tool Suite [ETS]. Along with the technical demonstrations, I presented how development happens in open-source environments and tried to emphasize the blurring line between user and developer in the Py4Science habitat. I provided some selected resources for the attendees' further study. I finished the tutorial by showing some advanced-level examples demonstrating the power of Python in scientific computing. Lastly, I invited everyone to the upcoming SciPy10 conference that will take place in Austin, Texas, starting on June 28.

You can access the slides of the tutorial by clicking this link.


Acknowledgements:

Although the original title of the tutorial was "Python and Scientific Computing in Open-Source", I chose to use the Py4Science name, following Fernando Pérez's tradition. I am thankful to my faculty advisor David Delene for joining us, introducing me, and sharing some exciting Python news from the European Aerosol-Cloud Research [EUCARI] group. Susan Caraher of the Graduate School helped me with some of the logistics of the session, and Vicki Thompson of the Continuing Education department provided eight PythonXY-installed laptops for our use. Thanks to Enthought for allowing us to use multiple copies of their EPD. Finally, I am indebted to Jarrod Millman for providing me with some t-shirts from previous SciPy conferences. His kind gesture inevitably doubled the joy of my teaching experience.