Python basics

Introduction

NanoLanguage is a dialect of Python, and all the standard features of Python are available when you invoke ATK. This means that when you build your NanoLanguage scripts, you have all the power of Python behind you.

If you are not familiar with Python, there are many good tutorials available on the web.

If this is your first encounter with NanoLanguage and Python, we recommend that you keep the Python tutorial and this Python basics page handy as you start using NanoLanguage. The spectrum of features offered by Python is enormous, and a lot of them will not be needed when you write scripts in NanoLanguage. The minimum set of structures you really should know about when using NanoLanguage is

  • Code indentation

  • Comments in Python

  • Modules

  • Lists

  • Tuples

  • Dictionaries

  • For loops

  • Objects

  • Functions and arguments

In the next few sections, we'll discuss the basic usage of the above Python concepts and some general Python features.

Indentation

One important point to know before you embark on writing your first NanoLanguage script is that Python relies on indentation when interpreting your script. If your code is not correctly indented, Python will stop executing the script and report an error. Precisely when and how you should indent code in your scripts will become apparent through the examples in this manual; a brief example showing how to define a function, however, illustrates the point:

def myNewMethod():
    print 'Hello World'

The colon “:” and the indentation of the following line tell us that the print statement is a part of the myNewMethod() function. The indentation determines whether a line of code belongs to the defined function or to the remaining code.

It is important to know that using both spaces and tabulation when indenting code sections or statements could mean trouble. The reason for this is that tabulation might not be interpreted the same way in different editors. This could become an issue if you work on the same script using different operating systems or collaborate with others on writing them. Some editors allow you to specify the number of spaces that should be inserted when pressing the TAB key, and we recommend that you use this option when available or simply use the SPACE key for indentation to increase interoperability.

This will do for now, but keep in mind that Python code must be properly indented, and never use both types of indentation in the same script. For a more complete discussion of the indentation rules used in Python, see this on-line resource on “Indenting Code”.
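As a small illustration of how indentation decides which statements belong to a block, the sketch below defines a function with two indented body lines, followed by an unindented line that is not part of the function. The names greet and messages are made up for this example.

```python
messages = []

def greet():
    # Both indented lines belong to the body of greet()
    messages.append('Hello')
    messages.append('World')

# This line is not indented, so it is not part of greet();
# it runs once, when the script reaches it
messages.append('outside')

greet()
# messages is now ['outside', 'Hello', 'World']
```

The unindented line runs before the function body, because the body is only executed when greet() is called.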

Comments

A comment line in Python starts with the character “#”:

# This is a comment line in Python
print 'This line will be executed'

The first line is ignored when interpreting the Python script. The second line will print the words

This line will be executed

to the screen.

Longer (multi-line) comments can be made using triple quotes

a = 2
"""    
A value was just assigned to a
We will now assign a value to b
Are you ready?
"""
b = 3
print "a x b = ", b*a

The lines between the triple quotes form a string that is not assigned to anything, so they have no effect on the script, and the result printed by the above would be

a x b =  6

In Python, it doesn't matter whether you use single quotes (') or double quotes (") for declaring a triple-quoted region.

Importing modules

A Python module is a file containing a collection of functions, classes, and many other tools that initially are not available when Python is invoked. In some sense, you may think of a Python module as a library. You load a Python module by using the import statement. Modules are typically imported in three different ways

  • either by importing the entire module, for example

import math

# Entire math module is now available
x = 3.14
y = math.cos(x)
z = math.sin(x)
  • or by importing specific elements from the module, for example

from math import cos

# Only cos() has been loaded from the math module
x = 3.14
y = cos(x)
  • or by importing all names from a module, for example

from math import *
 
# All names have been loaded from the math module
x = 3.14
y = cos(x)
z = sin(x)

As mentioned above, a '#' denotes a comment in Python. Everything past this character, but still on the same line, will not get interpreted.

You will find more information on modules in the Python tutorial, whereas an overview of the math module is provided here.
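One practical difference between the import styles is how they interact with names you define yourself. The sketch below is only an illustration: the user-defined sin() is hypothetical and exists to show that the module prefix keeps the two functions apart.

```python
import math
from math import cos

def sin(x):
    # Hypothetical user-defined function that shares its name with math.sin
    return 'my own sin'

x = 0.0
a = cos(x)        # imported by name, callable without a prefix: 1.0
b = math.sin(x)   # the module prefix makes clear which sin is meant: 0.0
c = sin(x)        # our own function: 'my own sin'
```

Had the script used from math import * after defining sin(), the name sin would have been rebound to the function from the math module. This is one reason to prefer the first two import styles in longer scripts.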

Lists

A list is an object used to collect elements. Creating a list is as easy as

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9] 
romans  = ['a', 'b', 'c', 'd']
gases   = ['Hydrogen', 'Helium']

The last of the above examples creates a list containing two strings and saves the list in a variable named gases. Lists can contain several different data types at the same time, which makes them a very flexible data structure.

Elements in a list are numbered starting from zero – so the first element in the list gases (Hydrogen) can be accessed as

print gases[0]

It is also possible to store different data types within the same list structure

gases = [1, 'Hydrogen', 2, 'Helium']

and extend these with additional elements

gases.extend([7,'Nitrogen']) 
print gases

which produces

[1, 'Hydrogen', 2, 'Helium', 7, 'Nitrogen']

Here we added the elements of another list to the list gases. Had we instead applied the list method append() to the list

gases.append([8,'Oxygen'])
print gases

we get

[1, 'Hydrogen', 2, 'Helium', 7, 'Nitrogen', [8, 'Oxygen']]

So, in this case, the actual list (and not the elements in it!) is added to the gases list. Another (and shorter) way of adding elements to a list is by using the “+” operator:

a = [1,2]
a = a + [3,4] 
print a

which gives

[1, 2, 3, 4]

Additional information on lists can be found here in the Python tutorial.
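A detail worth knowing is that extend() and append() change the existing list, whereas the “+” operator builds a new one. The following sketch illustrates this with a second variable, alias (a name chosen for this example), that refers to the same list object.

```python
gases = ['Hydrogen', 'Helium']
alias = gases                 # a second name for the same list object

gases.extend(['Nitrogen'])    # modifies the list in place
gases.append('Oxygen')        # appends a single element in place
# alias sees both changes: ['Hydrogen', 'Helium', 'Nitrogen', 'Oxygen']

gases = gases + ['Neon']      # builds a new list and rebinds the name gases
# alias still refers to the old list and does not contain 'Neon'

n_new = len(gases)   # 5
n_old = len(alias)   # 4
```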

Tuples

A tuple is constructed very similarly to a list, but by using parentheses instead of square brackets

mytuple = ('uno','duo')  # Note the curved parentheses
myothertuple = ('uno', ) # Note the comma just after 'uno'

An important detail in the above example is that a trailing comma is needed when the tuple contains only a single element; otherwise, it could not be distinguished from an ordinary parenthesized expression. For example,

t = ('uno',) # t is a tuple
s = ('uno')  # s is 'just' a string

Contrary to a list, a tuple is immutable, meaning that once it is defined, its values cannot be changed. For example, doing this

mytuple = ('uno','duo') 
mytuple[1] = 'quattro'

is illegal and results in the output

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object does not support item assignment

Here, Python raises the error

TypeError: object does not support item assignment

since the above assignment is illegal. Python reports that the error occurred on “line 1”. This is the kind of message you would get when using Python interactively. Had we used it in a script, the line number would refer to the actual line in the script.

Combinations of tuples and lists are allowed. To set up a collection of vectors to describe atomic coordinates, we may use lists and tuples in this fashion

atom_coordinate_1 = (0.1, 0.2, 0.3)
atom_coordinate_2 = (0.4, 0.5, 0.6)
atom_coordinate_3 = (0.7, 0.8, 0.9)

collection_of_atoms = [
    atom_coordinate_1,
    atom_coordinate_2,
    atom_coordinate_3
    ]
 
print collection_of_atoms

producing the output

[(0.10000000000000001, 0.20000000000000001, 0.29999999999999999), 
 (0.40000000000000002, 0.5,                 0.59999999999999998), 
 (0.69999999999999996, 0.80000000000000004, 0.90000000000000002)]

Here, we have displayed the output over several lines for aesthetic reasons.

For more details on tuples, consult this section of the Python tutorial.
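The sketch below shows how the mutable list and the immutable tuples behave when combined. It uses a try/except construction, which is not covered on this page, to catch the TypeError instead of stopping the script; the variable assignment_worked is introduced only for this illustration.

```python
atom_coordinate_1 = (0.1, 0.2, 0.3)
atom_coordinate_2 = (0.4, 0.5, 0.6)
collection_of_atoms = [atom_coordinate_1, atom_coordinate_2]

# A tuple can be unpacked into separate variables
x, y, z = collection_of_atoms[1]      # x = 0.4, y = 0.5, z = 0.6

# The list is mutable, so an entry can be replaced by a new tuple ...
collection_of_atoms[0] = (0.0, 0.2, 0.3)

# ... but an element inside a tuple cannot be reassigned
try:
    atom_coordinate_2[0] = 1.0
    assignment_worked = True
except TypeError:
    assignment_worked = False
```

Replacing the list entry does not alter atom_coordinate_1 itself; the list simply refers to a different tuple afterwards.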

Dictionaries

Often it can be useful to assign tags or keys to different values in order to distinguish among them. This can be accomplished in Python by using so-called dictionaries. In Python, a dictionary is called a dict. A dict is created like this

myDict = {'username':'henry','password':'secret'}
print myDict['username']

which yields

henry

In this example, username and henry are a key:value pair. So are password and secret. Note that dictionaries are created using curly braces “{}” (tuples use parentheses “()” and lists use square brackets “[]”).

There is no internal ordering in a dict, i.e. keys and values are not stored in the same order as they are entered into the dict. Values in the dict are accessed via their key. A value can be associated with several keys, whereas a key may be associated with a single value only.

Dictionaries are used a lot in NanoLanguage, for example to store configuration values or parameter settings.

Two frequently used methods associated with a dict are keys() and values(). The method keys() returns a list containing the keys of the dict whereas values() returns the values. For example

myDict = {'username':'henry','password':'secret'}
print myDict.keys()
print myDict.values() 

producing

['username', 'password']
['henry', 'secret']

It is also possible to query a dict for its length using the built-in function len()

myDict = {'username':'henry','password':'secret'}
print 'myDict has length', len(myDict)

resulting in the output

myDict has length 2

The return value of len() corresponds to the number of key:value pairs in the dict. You can find more information about dict usage here in the Python tutorial.
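Beyond reading values, a dict can be updated and searched. The following sketch shows a few common operations; the extra key email and the values assigned to it are invented for the example.

```python
myDict = {'username': 'henry', 'password': 'secret'}

myDict['email'] = 'henry@example.com'   # add a new key:value pair
myDict['password'] = 'better-secret'    # replace the value of an existing key

has_email = 'email' in myDict           # True
has_phone = 'phone' in myDict           # False

# Loop over the keys in sorted order, so the result does not depend
# on the order in which the dict stores them
summary = []
for key in sorted(myDict.keys()):
    summary.append(key + '=' + myDict[key])
# summary == ['email=henry@example.com', 'password=better-secret', 'username=henry']
```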

For loops

Once we have created a list, it would be nice to have an automatic way of addressing its individual elements. Python offers this functionality through the for-loop construction, e.g.

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]

for x in numbers:
    print x,

which generates the output

1 2 3 4 5 6 7 8 9

Another way to use a for-loop is when constructing iterative loops in numerical algorithms. Here is a simple example using Newton-Raphson iteration for determining √2

x = 20.0
for i in range(8):
    x = x - (x*x - 2.0)/(2.0*x)
    print x

which converges quadratically to √2 as

10.05
5.12450248756
2.75739213842
1.74135758045
1.44494338196
1.41454033013
1.41421360012
1.41421356237

Python has lots of so-called built-in functions, for example the range() function, which we have just used above.

The range() function returns a list containing all non-negative integers less than the argument passed to it, i.e. starting from zero. The for-loop iterates over all the elements generated by the range(8) call, performing a Newton update of the variable x in each iteration step of the loop. The value of the first element in the list and the increment between neighboring elements can be controlled by calling range() like this

for i in range(9,21,3):
    print i

giving rise to the output

9
12
15
18

It is also possible to find the length of a list (or a tuple) using the built-in len() function. This can neatly be combined with a for-loop to iterate over a list. For example

m = range(6)
for i in range(len(m)):
    print m[i],

produces

0 1 2 3 4 5

So, len() returns the length (i.e. the number of elements) of the list. This is then used to create a new list (using the range() function) over which the for-loop iterates. The comma “,” at the end of the print statement instructs Python to suppress printing a new line character. Otherwise, each number would have been printed on a separate line.

Consult the links for-loop, range() and len() for more information on loop construction in the Python tutorial.
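To make the convergence of the Newton-Raphson loop visible, one possible approach is to collect the error of each estimate in a list. The sketch below does this, using math.sqrt() as the reference value; the list name errors is chosen for this example.

```python
import math

x = 20.0
errors = []
for i in range(8):
    x = x - (x*x - 2.0)/(2.0*x)
    # Record how far the current estimate is from the library value
    errors.append(abs(x - math.sqrt(2.0)))

# The error shrinks with every step; after 8 steps it is below 1e-9
```

Once the estimate is close to the root, the error is roughly squared in every step, which is what quadratic convergence means in practice.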

Objects

Many of the structures which you work with in both NanoLanguage and Python are so-called objects. An object is a structure, which contains a lot of handy functions for accessing and manipulating the data assigned to the object. These special functions are called methods. Let us see how we work with these in practice. If we define a list like

numbers = [1, 5, 3, 6, 2, 8, 7, 9, 4]

the variable numbers in fact refers to a list object holding the numbers [1,5,3,6,2,8,7,9,4]. A list object contains several helpful methods, one of them being reverse(). You call reverse() like this

numbers.reverse()
print numbers

which gives the output

[4, 9, 7, 8, 2, 6, 3, 5, 1]

Another list method is sort(), which sorts the elements of a list

numbers.sort()
print numbers

producing

[1, 2, 3, 4, 5, 6, 7, 8, 9]

You can always use the built-in Python function dir() to display information about the functionality provided by a given object. For example,

print dir(list)

returns, among other names, the following methods of the list type

['append', 'count', 'extend', 'index', 
 'insert', 'pop', 'remove', 'reverse', 'sort']

For instructions about their specific usage, e.g. for reverse(), you can apply Python's built-in help system using the built-in help function. So, to get more information on the method reverse(), invoke help like this

help(list.reverse)

producing the output

Help on method_descriptor:

reverse(...)
    L.reverse() -- reverse *IN PLACE*

For further information about Python objects, consult the following entry in the Python tutorial.
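As the help text indicates, methods such as reverse() and sort() work in place. A common surprise is that they therefore return None rather than a new list. The sketch below illustrates this together with a few of the other list methods named in the dir() output; the variable names are chosen for the example.

```python
numbers = [1, 5, 3, 6, 2, 8, 7, 9, 4]

result = numbers.sort()        # sorts numbers in place
# result is None; the sorted data lives in numbers itself

position = numbers.index(6)    # index of the first element equal to 6: 5
count = numbers.count(6)       # how many times 6 occurs: 1
last = numbers.pop()           # removes and returns the last element: 9
```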

Functions and arguments

Very often, you will find that you keep copying and repeating almost identical segments of your Python code. A common approach to avoiding this redundancy is to encapsulate these segments in a so-called function. By doing this, you keep your code readable, concise, and clear. You also avoid reinventing the wheel every time you start on a new problem; instead, you merely reuse the function you wrote in a previous script.

We will use the above Newton iteration scheme as an example. To encapsulate this in a function, we could do the following

def newton():
    x = 20.0
    for i in range(10):
        x = x - (x*x - 2.0)/(2.0*x)
        print x

In the above, the def statement marks the beginning of the definition of the function newton(). It also tells the Python interpreter that all indented lines following the colon “:” belong to the function definition. Indentation is very important: all lines belonging to the function must be indented relative to the def line, and lines at the same nesting level must be indented by the same number of spaces. We may then call the function simply as

newton()

yielding

10.05
5.12450248756
2.75739213842
1.74135758045
1.44494338196
1.41454033013
1.41421360012
1.41421356237
1.41421356237
1.41421356237

Even though this already makes things easier to use, the function newton() still has certain shortcomings. For example, it would be nice if we could

  • supply the initial guess (currently x = 20 is always used).

  • set the maximum number of iteration steps.

In Python, we do this by passing arguments to the function. Here is an implementation of the above wish list by passing arguments to the function newton()

def newton(n,x):
    for i in range(n):
        x = x - (x*x - 2.0)/(2.0*x)
        print x

newton(8,4.0)

giving

2.25
1.56944444444
1.42189036382
1.41423428594
1.41421356252
1.41421356237
1.41421356237
1.41421356237

The function now uses 8 iteration steps, and the initial guess for the root is set to 4.0. Still, this is of limited use. Suppose that we actually wanted to use the result of the calculation (the numerical value of √2) in some subsequent part of our script. We solve this problem by letting the function return the result of the calculation, which we then may “grab” and store in a new variable. Here is how

def newton(n,x):
    for i in range(n):
        x = x - (x*x - 2.0)/(2.0*x)
     
    return x

x = newton(8,4.0)
print 'sqrt(2) = ', x

giving the output

sqrt(2) =  1.41421356237

This is more satisfactory, but there are still some handy features regarding function definitions that can make life even nicer for us. Often we might be completely satisfied with using x = 2.0 and n = 10 when we call the newton() function. To avoid supplying this redundant information, we can define default values for the function arguments. We accomplish this by specifying arguments in the following way

def newton(n=10, x=2.0):
    for i in range(n):
        x = x - (x*x - 2.0)/(2.0*x)
 
    return x

If we are happy with the default settings, we may invoke the function by calling it as newton(). If, on the other hand, the defaults should be changed, we may call it as, for example, newton(8,2.0). When the variables of a Python function are specified as above, they are called optional variables, as opposed to required variables, which have no default value.

This is certainly handy, but what if we often wanted to change the initial guess for x whereas the value of n should be kept at the default setting? It is possible to override the default value by explicitly naming the variable like this:

newton(x=3.0)

which overrides the default value of x while keeping the default value of the argument n. This way of assigning values to variables makes it possible to specify the variables of the function in whichever order you prefer:

newton(x=2.0, n=30)

The above is a valid call to the newton() function. You may include both optional and required variables when calling a function. In this case, however, the order is important! Once you have specified your first variable by name, no more variables may be specified according to order.
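The different calling styles can be compared side by side. The sketch below repeats the newton() definition with default values and calls it in four ways that all pass the same values.

```python
def newton(n=10, x=2.0):
    for i in range(n):
        x = x - (x*x - 2.0)/(2.0*x)
    return x

a = newton()               # both defaults: n=10, x=2.0
b = newton(10, 2.0)        # both by order
c = newton(10, x=2.0)      # first by order, second by name
d = newton(x=2.0, n=10)    # both by name, order does not matter
# All four calls pass the same values, so a == b == c == d

# newton(n=10, 2.0) would be rejected with a SyntaxError, because an
# argument given by order may not follow one given by name
```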

A somewhat more elaborate example is the following call to a NanoLanguage function, in which arguments are specified both by order and by name, and the return value is assigned to a variable:

my_basis = basisSetParameters(
    SingleZeta,
    0.006*Bohr,
    charge=0.9,
    element=Lithium
    ) 

In the above lines, the function basisSetParameters() is called with a number of arguments. If you examine the page describing basisSetParameters() in the reference manual of NanoLanguage, you will find that it takes the following parameters:

dictionary basisSetParameters(
    type,
    radial_sampling_dr,
    energy_shift,
    delta_rinn,
    v0,
    charge,
    split_norm,
    element
    )

all of which are optional parameters with a default value. In our example above, the argument SingleZeta is assigned to the parameter type because that is the first parameter of basisSetParameters(). Remember that if you do not specify parameters by name, they are assigned values according to order. The second argument, 0.006*Bohr, is assigned to the parameter radial_sampling_dr because it is the second argument in the call and we are still specifying arguments by order. The value 0.9 is assigned to the parameter charge by name. This way, all the parameters of basisSetParameters() between radial_sampling_dr and charge keep their default values. This also means that all further arguments to basisSetParameters() must be specified by name if we want to change their default values, as is done for element.

If you want to know more on specifying functions and their arguments, please see the function section of the Python tutorial or this on-line resource on optional and required arguments.
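The way arguments are matched to parameters can be tried out with a small stand-in function in plain Python. The function basis_sketch() below is hypothetical: it is not the real basisSetParameters(), it has fewer parameters, its default values are placeholders, and the number 0.006 is passed without the Bohr unit. It merely returns a dict showing which value ended up in which parameter.

```python
def basis_sketch(type='DoubleZeta', radial_sampling_dr=0.001,
                 energy_shift=0.01, charge=0.0, element=None):
    # Hypothetical stand-in, not the real basisSetParameters();
    # parameter names are borrowed, default values are placeholders
    return {'type': type,
            'radial_sampling_dr': radial_sampling_dr,
            'energy_shift': energy_shift,
            'charge': charge,
            'element': element}

params = basis_sketch('SingleZeta', 0.006, charge=0.9, element='Lithium')
# 'SingleZeta' is bound to type (first by order)
# 0.006 is bound to radial_sampling_dr (second by order)
# energy_shift was skipped and keeps its default 0.01
```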

Using NumPy in NanoLanguage

The NumPy module is used throughout NanoLanguage, e.g. to store values returned by analysis functions. It can perform advanced mathematical operations much faster than code based on Python lists. NumPy arrays resemble lists, but they offer a lot more functionality. NanoLanguage ships with built-in NumPy support to make it easy to use.

A few major differences between ordinary lists and a NumPy array can be seen from this short NanoLanguage script:

from numpy import array

a = array([1,2]) # a NumPy array
a = a+[3,4]
print a
[4 6]           # program output
a = [1,2]       # an ordinary Python list
a = a+[3,4]
print a
[1, 2, 3, 4]    # program output

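NumPy arrays can in many ways be regarded as matrices, as they offer much of the same functionality. Note, however, that the “*” operator multiplies element by element and is not a matrix product: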

a = array([[1,2],[3,4]])
a = a *[3,4]
print a
[[ 3  8]   # program output
 [ 9 16]]  # program output
print a.trace()
19         # program output
a = a.transpose()  # transpose() returns the result; it does not modify a in place
print a
[[ 3  9]   # program output
 [ 8 16]]  # program output
print a.trace()
19         # program output

Note in the above that the values 8 and 9 changed places in the matrix, whereas the trace of the matrix remains the same.
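Two points from the example above are easy to get wrong: “*” is an element-wise operation, and transpose() returns its result instead of changing the array it is called on. The sketch below illustrates both, and uses numpy.dot() for an actual matrix-vector product; the variable names are chosen for this example.

```python
from numpy import array, dot

a = array([[1, 2], [3, 4]])

elementwise = a * [3, 4]     # entry by entry along each row: [[3, 8], [9, 16]]
matvec = dot(a, [3, 4])      # matrix-vector product: [1*3 + 2*4, 3*3 + 4*4] = [11, 25]

t = a.transpose()            # returns the transposed array
# a itself still has the rows [1, 2] and [3, 4]
```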

NumPy arrays may also be converted into lists

a = array([[1,2],[3,4]])
print a.tolist()
[[1, 2], [3, 4]]     #program output

There are many more possibilities with arrays from the NumPy module, and operating on whole arrays is usually faster than iterating over lists in for-loops.

More information about the NumPy module can be found at the NumPy homepage or by using the dir() command on a NumPy array object. Details on how NumPy can be used for performance can be found on the Python speed/performance tips page.