Python is a scripting language and general-purpose programming language created by Guido van Rossum in the late 1980s (see http://en.wikipedia.org/wiki/Python_(programming_language)). It has become very popular for a wide range of applications, often providing the framework for gluing powerful C-based algorithms to graphical user interfaces (see http://docs.python.org/release/2.5.2/ext/intro.html), making complete graphics production programs by using Tcl/Tk (a rapid-prototyping scripting language, http://en.wikipedia.org/wiki/Tcl) via an interface package called Tkinter (http://wiki.python.org/moin/TkInter).
This is a brief introduction to and summary of Python statements. Python comes in several different versions. Currently, versions 2.4, 2.5, 2.6 and 2.7 are the most commonly used. This introduction will be based on version 2.7. Some portions of the language have been glossed over or omitted here. The newer, but incompatible, Python 3 (http://wiki.python.org/moin/Python2orPython3) is not discussed.
For more detail on Python syntax, see http://docs.python.org/reference/index.html.
A Python program may be created a line at a time, interactively, or as a text document.
When we write in most languages, we use linear sequences of characters or glyphs, some of which are grouped together to make the words of the language and some of which are used as punctuation to help us organize those words. The words and punctuation are called "tokens".
In Python (as well as in C, C++, Java and JavaScript), the characters used are drawn from a limited set of characters that must, at a minimum, include representations of
Python permits many more characters than this, allowing all the characters of the 7-bit ASCII character set for the programs themselves, and can handle text strings drawn from the entire Unicode character set, typically stored in the 8-bit multibyte encoding called UTF-8 (see www.unicode.org).
Multiple "physical lines" may be joined together into a single logical line by ending a line with an unquoted reverse solidus (backslash, "\").
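As a minimal sketch of line joining, the fragment below spreads one statement over two physical lines. The variable names are made up for illustration, and the fragment avoids the print statement so that it is not tied to one interpreter version.

```python
# Two physical lines joined into one logical line by a trailing backslash.
total = 1 + 2 + \
        3 + 4

# A backslash inside a quoted string is part of the string, not a line join.
path_text = "C:\\temp"
```

Here total ends up holding the sum of all four numbers, exactly as if the statement had been written on one line.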
Lines are scanned for "tokens".
In Python, the spacing at the beginning of a line is, in and of itself, a significant token. Line indentation is used to organize multiple lines of statements into blocks in much the same way that "{" and "}" are used in C, C++, Java and JavaScript.
The remaining tokens are organized into keywords, identifiers, literals (what we would call strings and constants in other languages), operators and delimiters.
The keywords are the reserved words of the language. The words that are used as keywords or otherwise reserved in Python are: and, as, assert, break, class, continue, def, del, elif, else, except, exec, finally, for, from, global, if, import, in, is, lambda, not, or, pass, print, raise, return, try, while, with, yield, and None. To help ensure portability of code to C, C++, Java or JavaScript, the following words should also not be used for user-defined identifiers: abstract, asm, auto, boolean, byte, case, catch, char, const, default, delete, do, double, entry, enum, export, extends, extern, false, final, float, for, friend, function, goto, implements, in, inline, instanceof, int, interface, long, native, new, null, operator, package, private, protected, public, register, short, signed, sizeof, static, strictfp, struct, super, switch, synchronized, template, this, throw, throws, transient, true, typedef, typeof, union, unsigned, var, virtual, void, volatile.
Identifiers are sequences of letters, digits and underscores. The first character of an identifier must be a letter or an underscore. Only the letters from A-Z and a-z are permitted (no accented letters, no currency symbols as in some other languages). When used interactively, the Python interpreter reserves the special identifier consisting of a single underscore to hold the result of its last evaluation. There are many other special uses of identifiers beginning with the underscore, so it is not a good idea to start a user-defined identifier with the underscore. In particular, all identifiers beginning and ending with two underscores are reserved for system-defined names. The keywords and reserved words should not be used as identifiers.
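The identifier rule can be checked mechanically. The sketch below is one possible way to do it, assuming the standard keyword and re modules; the helper name is_usable_identifier is hypothetical and is not part of Python.

```python
import keyword
import re

# Letters A-Z and a-z, digits and underscores; the first character is not a digit.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

def is_usable_identifier(name):
    """Return True if name fits the identifier rule and is not a keyword."""
    # Hypothetical helper written only for this illustration.
    if keyword.iskeyword(name):
        return False
    return IDENTIFIER_PATTERN.match(name) is not None
```

With this helper, a name such as row_count passes, while 2nd (leading digit) and while (a keyword) are rejected. The check does not cover the extra list of words reserved in other languages.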
Each portion of a Python program consists of a sequence of statements. A statement may be a small statement, a simple statement or a compound statement. A simple statement is a series of one or more small statements separated by semicolons and all formatted on a single line. The last semicolon on the line is optional and is not normally used. A compound statement extends over multiple lines beginning with lines that establish the purpose of the compound statement followed by indented lines that do the work of the statement.
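The difference between a simple statement and a compound statement might look like this in practice. This is only an illustrative sketch with made-up variable names.

```python
# One simple statement made of three small statements separated by semicolons.
width = 3; height = 4; area = width * height

# A compound statement: a header line ending in a colon, then an indented body.
if area > 10:
    size_label = "large"
else:
    size_label = "small"
```

The indentation of the lines under if and else is what marks them as belonging to those clauses.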
A small statement may be any of the following:
A compound statement may be any of the following:
Python has many operators. They are similar to, but in some ways distinct from, the C, C++ and Java operators. See http://www.tutorialspoint.com/python/python_basic_operators.htm. The operators in Python are:
target_list = expression_list
which takes a comma-separated list of one or more expressions, evaluates them and stores the results into a comma-separated list of one or more target variable identifiers. The entire right-hand side (rhs) of the assignment expression is evaluated before anything is stored into the variable identifiers on the left-hand side. Very complex things can happen when the variable identifiers refer to classes.
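Because the right-hand side is evaluated in full before any target is stored, two variables can be exchanged in a single assignment. A small sketch, with variable names chosen only for illustration:

```python
# The whole right-hand side is evaluated first, so this swaps the two values.
first, second = 1, 2
first, second = second, first

# Several targets receive the items of the expression list in order.
x, y, z = 10, 20, 30
```

No temporary variable is needed for the swap, which is the practical consequence of the evaluation order described above.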
The equals sign can be "augmented" with an operator that combines the variable on the left-hand side with the expressions on the right-hand side before storing the result back into the variable on the left-hand side. Unlike plain assignment, an augmented assignment takes a single target rather than a list of targets.
target += expression_list
target -= expression_list
target *= expression_list
target /= expression_list
target //= expression_list
target %= expression_list
target **= expression_list
target >>= expression_list
target <<= expression_list
target &= expression_list
target ^= expression_list
target |= expression_list
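A few of the augmented forms in use, as a sketch with made-up variable names; each line updates the variable in place of a longer "variable = variable operator value" statement.

```python
count = 10
count += 5        # same effect as count = count + 5
count //= 4       # floor division: 15 // 4 is 3
count **= 2       # exponentiation: 3 ** 2 is 9

flags = 0b0101
flags |= 0b0010   # bitwise or
flags <<= 1       # shift left by one bit
```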
print expression
print expression , expression ...
print >> file_expression , expression ...
When the print statement ends in a comma, the next print statement will continue on the same line. Otherwise each print statement ends with a newline character. If the print statement has no expression and no comma, it just generates an empty line.
del identifier
pass
break
continue
return
return expression
raise
raise expression
yield expression
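The import statement makes identifiers defined in another module available. The form "from module import *" brings in all the identifiers. It is possible to specify what is to be imported in great detail.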
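As a sketch of how selective an import can be, the fragment below uses three common forms with the standard math and os modules. The local name os_path and the sample file name are chosen only for this example.

```python
import math                      # bind the module name itself
from math import sqrt            # bind one identifier from the module
from os import path as os_path   # bind an identifier under a different local name

hypotenuse = sqrt(3 * 3 + 4 * 4)
circle_area = math.pi * 2 ** 2
file_name = os_path.basename("reports/summary.txt")
```

Importing only the names that are needed keeps the program's own identifiers from being silently replaced, which is the usual argument against importing everything.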
global identifier
exec expression
assert expression
if expression ":"
    statement to execute if true
    statement to execute if true
    ...
    statement to execute if true
elif expression ":"
    statement to execute if true
    statement to execute if true
    ...
    statement to execute if true
...
else ":"
    statement to execute if false
    statement to execute if false
    ...
    statement to execute if false
while expression ":"
    statement to execute each time
    statement to execute each time
    ...
    statement to execute each time
else ":"
    statement to execute once after the loop
    statement to execute once after the loop
    ...
    statement to execute once after the loop
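A sketch of a while loop with an else clause follows. The function name first_divisor is hypothetical. One detail worth knowing is that the else clause runs when the loop condition becomes false, and is skipped when the loop is left with break.

```python
def first_divisor(number):
    """Return the smallest divisor of number above 1, or None if there is none."""
    candidate = 2
    while candidate * candidate <= number:
        if number % candidate == 0:
            found = candidate
            break          # leaving with break skips the else clause
        candidate += 1
    else:
        found = None       # runs once, when the loop condition becomes false
    return found
```

Calling first_divisor(15) leaves the loop through break, while first_divisor(13) runs the loop to completion and falls into the else clause.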
for identifier_list in expression_list ":"
    statement to execute on each expression
    statement to execute on each expression
    ...
    statement to execute on each expression
else ":"
    statement to execute once after the loop
    statement to execute once after the loop
    ...
    statement to execute once after the loop
This is different from the C for-loop, but a similar effect can be achieved by using the range function to generate an expression list that is a sequence of numbers.
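The following sketch shows the range idiom standing in for a counting C loop, followed by a loop over an explicit list with two loop identifiers. The data values are invented for the example.

```python
# Roughly the C loop: for (i = 0; i < 5; i++) squares[i] = i * i;
squares = []
for i in range(5):
    squares.append(i * i)

# Iterating over an expression list directly, with two loop identifiers.
totals = {}
for name, score in [("ann", 3), ("bob", 5)]:
    totals[name] = score * 2
```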
try :
    statement to try
    statement to try
    ...
    statement to try
except expression_specifying_the_error :
    statement to handle this error case
    statement to handle this error case
    ...
    statement to handle this error case
...
except :
    statement to handle any remaining error case
    statement to handle any remaining error case
    ...
    statement to handle any remaining error case
finally :
    statement to execute in all cases
    statement to execute in all cases
    ...
    statement to execute in all cases
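One way the try statement might be used is sketched below. The function name safe_ratio and its log argument are hypothetical. The second handler names Exception instead of using a bare except, which is a slightly narrower choice than the catch-all form.

```python
def safe_ratio(numerator, denominator, log):
    """Divide two numbers, recording each stage in the list log."""
    try:
        result = numerator / float(denominator)
    except ZeroDivisionError:
        log.append("division by zero")
        result = None
    except Exception:           # a bare "except:" would also catch this
        log.append("some other error")
        result = None
    finally:
        log.append("done")      # runs whether or not an error occurred
    return result
```

After any call, the last entry in log is "done", because the finally clause is executed in all cases.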
with expression as identifier :
    statement
    statement
    ...
    statement
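A common use of the with statement is to make sure a file is closed. The sketch below assumes a throwaway file in a temporary directory; the path and variable names are invented for the example.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
scratch_path = os.path.join(tempfile.mkdtemp(), "note.txt")

with open(scratch_path, "w") as out_file:
    out_file.write("hello\n")

# The file object is closed automatically when the block ends.
closed_after_block = out_file.closed

with open(scratch_path) as in_file:
    first_line = in_file.readline()

# Remove the scratch file and its directory again.
os.remove(scratch_path)
os.rmdir(os.path.dirname(scratch_path))
```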
You may insert annotations to help the reader of a Python program, comments that will be ignored by the Python system, by using the hash character ("#") or the treble quote mark ("""). The hash character causes all characters from that point onwards on a given line to be ignored. Normal parsing resumes on the next line.
The treble quote mark is used to delimit multiline strings, but when a treble-quote-mark delimited string appears as the first statement in the body of a function, class or module, it is treated as a documentation string, or "docstring", intended to provide a descriptive comment.
See http://en.wikibooks.org/wiki/Python_Programming/Source_Documentation_and_Comments
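A short sketch of both kinds of annotation follows. The function is invented for the example. Unlike a hash comment, the docstring is kept by Python and can be read back through the function's __doc__ attribute.

```python
def fahrenheit_to_celsius(degrees_f):
    """Convert a temperature from Fahrenheit to Celsius."""
    # Everything after the hash on this line is ignored.
    return (degrees_f - 32) * 5.0 / 9.0   # a comment may follow code

doc_text = fahrenheit_to_celsius.__doc__
```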
A function is a named set of statements to be executed when that name is "called". When a function is made part of an object it is called a method.
A function definition consists of the keyword "def" followed by the name of the function, followed by a parenthesized argument list, then a colon and then the indented statements defining what the function does.
def identifier ( formal_argument_list ) :
    function_statement
    function_statement
    ...
    function_statement
If a function does not have formal arguments, the identifier is followed by an empty pair of parentheses and then the colon. The parentheses may not be omitted, even when they are empty.
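Two small definitions, sketched with hypothetical names: one function with no formal arguments and one with two, the second of which has a default value.

```python
def greeting():
    # No formal arguments: the parentheses are still written.
    return "hello"

def scaled_sum(values, factor=1):
    """Add up values and multiply the total by factor."""
    total = 0
    for value in values:
        total += value
    return total * factor
```

A call such as scaled_sum([1, 2, 3], factor=2) names the argument explicitly; leaving factor out uses the default of 1.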
A class definition consists of the keyword "class" followed by the name of the class, then a colon and then the indented statements defining the class, usually function definitions for the functions that are members of the class. A class may inherit from other classes by following the name of the class with a parenthesized, comma-separated list of base classes from which this class will inherit function definitions and variables.
class identifier :
    class_statement
    class_statement
    ...
    class_statement
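Inheritance can be sketched with two invented classes, Shape and Square. The base class supplies a method that the derived class reuses without redefining it.

```python
class Shape(object):
    def __init__(self, name):
        self.name = name

    def describe(self):
        # Relies on an area() method supplied by a derived class.
        return self.name + " with area " + str(self.area())

class Square(Shape):          # Square inherits describe() from Shape
    def __init__(self, side):
        Shape.__init__(self, "square")
        self.side = side

    def area(self):
        return self.side * self.side
```

Calling Square(3).describe() runs the inherited describe method, which in turn calls the area method defined on Square.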
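Decorators on function and class definitions are patterns that allow the class or function that follows to be modified to conform to that pattern. A decorator is written as "@" followed by an expression on the line before the definition; the defined function or class is passed to the decorator, and whatever the decorator returns is bound to the name in its place. In this respect decorators play a role similar to what some other languages call a "macro". See http://www.artima.com/weblogs/viewpost.jsp?thread=240808 for an explanation.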
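A minimal decorator might look like the sketch below. The names call_counter and double are hypothetical; the decorator simply wraps the function so that each call is counted.

```python
def call_counter(func):
    """Return a wrapper around func that counts how often it is called."""
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@call_counter                 # same as: double = call_counter(double)
def double(value):
    return value * 2

results = [double(2), double(5)]
```

After the two calls, double.calls holds the number of calls made, while the results themselves are unchanged by the wrapping.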
Prepared by Herbert J. Bernstein
30 August 2010.
© Copyright 2010 Herbert J. Bernstein. All Rights Reserved.