Programming Basics Part 1

Introduction

This is the first instalment in a series of programming sessions. Information presented in this session is taken from a variety of sources, including the authors' experiences in programming computers over several years.

Definition

For the purpose of these sessions, programming is defined as 'communicating with the machine to instruct the processor the steps to follow to perform a task'.

Languages

A computer processor doesn't understand our means of communication via a natural language. The processor understands only a code based purely on numbers, called binary code.

If we were to list a few natural languages, we could see:
Chinese
Egyptian
English
French
German
Greek
Hindi
Italian
Japanese
Latin
Malay
Norwegian
Polish
Spanish
Swahili
Swedish
Turkish
Zulu
There are a great many natural languages, and some are related to, partially derived from or influenced by other languages.

A programming language is also a means of communication, and there are a lot of them available. Some programming languages, just like natural ones, are related to, partially derived from or influenced by other languages. A partial list of programming languages looks like:

A
Assembler
B
Basic
C
C++
Clipper
Cobol
Eiffel
HTML
Java
JavaScript
ObjectPAL
Pascal
Perl
Python
REXX
Scheme
Smalltalk
SQL
Tcl/Tk

See if you can find some more languages by searching the internet or visiting your local library. For those who are interested, research a specific language to see how it came about.

Languages are created for a variety of reasons. C was developed to answer a need at AT&T (American Telephone and Telegraph) Bell Laboratories and followed on from the B language. C++ was derived from C at the same company to add the object-oriented features needed to solve another problem. Some languages were created so that a new processor type would have a programming language; others began as an experiment at a university computer lab, became publicly available and spread quickly into mainstream use.

In spite of the wide variety available and their differences, they also have a lot of common features and structures that are almost identical. Generally, languages are related to either C/C++ or Pascal through the syntax and structures they use.

A useful program accepts an input or reads data, processes the data and outputs or writes a result for the end user to see.

Pseudo Code

To assist the software engineering process, a near-natural language called pseudo code was developed. It allows programs to be developed on paper, without touching a machine (other than to type in a word processor) and without committing to a specific language or platform (such as OS/2), until paper-based testing has minimised the program's errors. Pseudo code is used by professional programmers to allow testing to occur early in the writing cycle, where the cost of fixing a problem is minimal.

The discussion on Pseudo Code leads into several topics that are directly related to the coding of a program.

Keywords

Keywords are found in every language and our pseudo code is no exception. For example, how will we know whether an input should come from the keyboard or from a file? How about the output: screen or file? Keywords represent intrinsic functions within a language, and they cannot be used by the programmer for any other purpose.

For example, once a word like PRINT has been set aside as a keyword, the programmer can no longer use it as anything but a command telling the program to send something to the screen. We also need a way to get input from the user or a file, so INPUT becomes another keyword. Our list of keywords is:

AND, CASE, ELSE, ENDCASE, ENDFOR, ENDIF, ENDWHILE, FOR, FROM, IF, INPUT, NOT, OF, OR, PRINT, REPEAT, STEP, THEN, TO, UNTIL, WHILE

Variables and Constants

Programming is based on algebraic operations. In simple algebra, letters and symbols stand in for real values. The formula for the circumference of a circle gives us an example of Variables and Constants:

"C = 2 * Pi * r" where C is the circumference, r is the radius and Pi is a symbol for the value 3.1416 (to four decimal places).

C is the holder of the calculation result, and this fact makes C a 'Variable'. The value of C depends on the value of r, which is also a 'Variable' (ie the value will vary depending on the size of the circle whose circumference we want to calculate). 2 is a fixed value that cannot change, and this fact makes 2 a 'Constant'. Pi is a symbolic representation of the ratio of the circumference to the diameter of a circle, and it is also a Constant (since the radius is half the diameter, the constant 2 is there to turn the radius into a diameter).

To help conceptualise Variables and Constants, imagine a glass sitting on a bench. The glass or container has an open top, and we decide to pour water into it. Until the glass was filled with water, it could be said to have nothing, or a null value, in it. We could just as easily have made the contents milk, coffee or tea. In fact, we could pour out the water and replace it with red cordial, then replace that with green tea, then milk. This is the idea behind a variable: a container (or placeholder in memory) for holding a value that we can change by telling the computer to do so. In contrast, a Constant, once filled with a value, holds that value until the program that created it is terminated. If you like, the glass has a sealed top after filling to prevent the contents from being changed.

Variables and Constants are given names in code to make them meaningful. Early languages supported an algebra-like approach to programming and allowed only one or two characters to name them, eg: A, A1 etc. Modern languages support names of 40 characters or more in length, eg: InterestEarnedOnLargeAmounts is more meaningful than using I in the code.

So for our circumference example, we could write the formula as:

CircumferenceOfaCircle = 2 * Pi * radiusInputByUser
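
As a rough illustration only, the same formula might look like this in Python (a language picked here just for the examples; the pseudo code does not depend on it). The names PI and circumference_of_a_circle are made up for this sketch. Python has no true constants, so the capitalised name is only a convention telling readers not to change the value.

```python
import math

# Capital letters mark PI as a constant by convention only;
# Python itself would not stop us from changing it.
PI = math.pi


def circumference_of_a_circle(radius):
    """Return 2 * PI * radius for the given radius."""
    return 2 * PI * radius


# A variable can be refilled with a new value at any time,
# like pouring a different drink into the glass.
radius_input_by_user = 10
print(circumference_of_a_circle(radius_input_by_user))

radius_input_by_user = 2
print(circumference_of_a_circle(radius_input_by_user))
```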

Types

One area not yet discussed is the use of types in variables and constants. In mathematics, there are two basic numerical types used: Integers and Real Numbers. The other type used in computing is the char or string type, which is used to display messages such as "Error". All other types, such as logical, are derived from one of the three basic types. Typing can be expanded to create a special type that is a combination of types, both basic and derived.

In a computer, any number, letter or just about anything else that can be stored is treated as a code of Ones and Zeros (On or Off), each called a bit (BInary digiT), organised into blocks of eight bits called a byte. Without types, the computer has no way of telling whether a value is an integer, a character (string) or has some other significant meaning. As in the Variable example above, the glass could be described as a container that holds type liquid, whereas a container of a different type, such as a wheelbarrow, may only hold type solids.

When a type is defined, then the bundles of ones and zeros become a meaningful item. For example, the number 65 could represent either the integer 65 or the letter 'A' purely dependent upon its type. If the variable or constant where the 65 was stored was of type integer then we have a number, conversely, if the type was char (or string) then it would appear on the screen as the letter 'A'.
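
The 65 example can be tried out in Python, used here only as an illustration. The variable names are invented for this sketch; chr() and ord() are Python's built-in functions for converting between a numeric code and its character.

```python
code = 65

# Treated as an integer, the stored value is simply the number 65.
as_integer = int(code)

# Treated as a character, the same value is the letter 'A'.
as_character = chr(code)

print(as_integer)    # 65
print(as_character)  # A

# ord() goes the other way, from a character back to its numeric code.
print(ord("A"))      # 65
```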

Operators

Operators are symbols used to indicate an operation, usually mathematical or logical. Operators include plus (+), minus (-), multiplication (*), divide (/), equals (=), and the logical AND, OR and NOT. They are used to tell the program to perform the operation or to test for a logical condition or value. The items being operated on are called operands, eg in A = B + C, A is the result and B and C are the operands of an algebraic addition operation.
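
A small Python sketch of these operators follows. The values 7 and 2 are arbitrary, and Python spells the logical operators in lower case (and, or, not) where our pseudo code uses AND, OR and NOT.

```python
b = 7
c = 2

# Arithmetic operators work on the operands b and c.
print(b + c)  # 9
print(b - c)  # 5
print(b * c)  # 14
print(b / c)  # 3.5

# Logical operators combine true/false values.
print(b > c and c > 0)  # True
print(b < c or c > 0)   # True
print(not b > c)        # False
```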

Expressions

Expressions are combinations of operators (ie: +, -, ×, ÷, = etc), variables and constants used to make decisions, or to process data in the specified manner and assign results to holders (variables) for future use. The circumference formula could be turned into an expression like:

2 × PI × R

Notice the explicit multiply symbols. In algebra the formula is usually written 2πr, and the missing multiply symbols are assumed. Programming languages require that they be inserted, otherwise 2PIR might be interpreted as a single variable or constant. Also, we have not given the answer a place to go. This expression would evaluate to zero if R was assigned the value 0.

If we needed to keep the circumference for another expression, we would need to assign the result to a variable to use later. Thus:

C = 2 × PI × R

is used in a program to calculate and hold the answer for later use. This type of expression is an assignment expression. Assignments consist of a Left Hand Side (LHS) and a Right Hand Side (RHS) with the = (equals) operator between.

The other type of expression is the comparison expression. Instead of assigning a value from the RHS to the LHS, we compare the values of Left and Right sides using operators like <, =, >= to tell us if a condition has been met or not. The effect is that the program can make decisions based on criteria set before the code is written. For example, determine if the circumference exceeds a predetermined value of 100.

C > 100

makes the comparison, and the result is a value known as 'true' if C is greater than 100; otherwise the result returned is called 'false'. In this form the expression has no immediate use. In Statements and Structures, the use of these expressions becomes much clearer. The use of the '=' operator in a comparison needs clarification.

The expression C=100 could mean assign the value of 100 into the container C, or it could ask the question: is the value of C equal to 100? We could agree that inside a statement like IF or CASE it means comparison, and otherwise it means assignment from RHS to LHS. The alternative is to use different symbols for assignment and comparison, that is, '=' means assignment and '==' (double =) means comparison.
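
Python happens to take the second approach, so it can serve as a quick illustration of the difference. The variable names below are invented for this sketch.

```python
c = 100  # '=' assigns: put the value 100 into the container c

is_equal = (c == 100)   # '==' compares: is the value of c equal to 100?
is_bigger = (c > 100)   # other comparisons also give True or False

print(is_equal)   # True
print(is_bigger)  # False
```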

Statements

Statements consist of combinations of keywords and expressions and generally occupy one line of code. Complex structures squeezed onto a single line obscure the program's operation and make the code difficult to read (not to mention straining both the eyes and minds of the readers).

In the circumference of a circle example, the assignment expression is also a complete statement. Where a comparison statement is used, its true use is in deciding what the next action will be. In the next section on structures, the use of the comparison expressions will become clearer.

Structures

There are two categories that program structures fall into: decisions (conditional branches) based on set criteria, and looping (iteration). With these structures (common to all programming languages) the programmer can set about analysing a real task and producing usable code for the computer. Each of these structures constitutes one statement, even though it is spread over several lines and contains statements within it.

Conditional Branches

There are two constructs that denote branching decisions, one is called the IF THEN (ELSE) and the other is a CASE or SWITCH. Both of these utilise the comparison expression to determine the next action.

The IF THEN structure has two forms:

(a) IF <expression> THEN <statement>

(b) IF <expression> THEN <true statement> ELSE <false statement>

The IF structure in (a) above reads:

IF <expression> is TRUE, THEN perform <statement>

and in (b) read:

IF <expression> is TRUE, THEN perform <true statement> ELSE perform <false statement>

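The IF statements above imply that only one statement can be selected for execution. To allow a block of several statements, a precise and unambiguous way to close the structure is required. Thus:

IF <expression> THEN <statement> <statement> ENDIF

IF <expression> THEN <true statement> <true statement> ELSE <false statement> <false statement> ENDIF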

As you can now see, the ability to decide which blocks of code to execute is made available to the programmer. There is no real limit to the number of <statement> that may be placed inside the structure (dependent upon the programming language used), but the lines are long and the code is harder to read (readability).

To assist with the coding, the IF structure may be broken into separate lines to make it friendlier to read, thus:

IF <expression> THEN
    <statement>
    <statement>
ENDIF

This layout clearly shows at a glance the code's intention and the control of the blocks. This style uses returns, spaces and tabs to 'indent' the code and is often called indenting. The use of 'whitespace' (returns, tabs and spaces) improves human readability of the code.

So in the case of our Circumference example, the code for determining that circumference is too big:

IF C>100 THEN PRINT "Sorry, too big" ENDIF

could be the statement needed to provide feedback about the decision made.
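
For comparison, here is roughly how that decision might look in Python. The function name check_circumference is made up for this sketch, and returning the message instead of printing it directly is a choice made only so the result is easy to check.

```python
def check_circumference(c):
    """Return a warning when c exceeds 100, otherwise an empty string."""
    if c > 100:
        return "Sorry, too big"
    return ""


print(check_circumference(120))  # Sorry, too big
print(check_circumference(80))   # prints an empty line
```

Python has no ENDIF keyword. The indented block under the if line ends where the indenting ends, so the layout is part of the language and not just an aid to the reader.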

We are not restricted to just one IF..THEN..ENDIF. We may also nest the statements to allow more complex decisions to be made, so:

IF <expression> THEN
    <statement>
    IF <expression> THEN
        <statement>
    ENDIF
ELSE
    IF <expression> THEN
        <statement>
        <statement>
    ELSE
        <statement>
    ENDIF
    <statement>
ENDIF

is one potential way to use the decision structure for our program. The use of indenting makes the intent of the decision very clear, and the eye can skip down a page easily picking out the purpose without a great deal of effort.

The CASE statement is an alternative to nested or repeated IF statements:

CASE <variable> OF
    <expression> : <statement>
    <expression> : <statement>
    ...
ELSE
    <statement>
ENDCASE

CASE structures lend themselves well to menu item decisions, or to any situation where CASE would simplify a nested or repetitive IF statement. For example:

CASE BankBalance OF
    < 100 : chargeAccountKeepingFee
    > 500 : addInterest
    > 2000 : addMoreInterest
ENDCASE

The Left and Right Sides are handled by the CASE OF statement, making BankBalance the LHS and the expression < 100 the RHS of the full comparison expression.
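
One way to sketch the bank balance decision in Python is a chain of if/elif tests. This sketch assumes that only the first matching branch runs, which the pseudo code does not spell out, so the > 2000 test is placed before the > 500 test to keep it reachable. The function name account_action and the "noAction" result for balances that match no branch are invented for illustration.

```python
def account_action(bank_balance):
    """Return the name of the action chosen for the given balance."""
    if bank_balance < 100:
        return "chargeAccountKeepingFee"
    elif bank_balance > 2000:
        # Tested before > 500, because every balance over 2000
        # is also over 500 and would otherwise never get here.
        return "addMoreInterest"
    elif bank_balance > 500:
        return "addInterest"
    else:
        # Hypothetical fallback, playing the role of the ELSE branch.
        return "noAction"


print(account_action(50))    # chargeAccountKeepingFee
print(account_action(1000))  # addInterest
print(account_action(3000))  # addMoreInterest
```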

Iterations

There are three kinds of loop in use in programs. Looping re-uses a block of code for a number of iterations. One loop is used when a known number of iterations is required, and the other two are used when the exact number of iterations cannot be determined before starting. Thus:

FOR <variable> FROM <start constant> TO <finish constant>
    <statement>
    ...
ENDFOR

and

FOR <variable> FROM <start constant> TO <finish constant> STEP <step constant>
    <statement>
    ...
ENDFOR

are called FOR LOOPS. They set <variable> to the value of <start constant>, then add one to <variable> on each pass, or <step constant> when STEP is given (also called 'incrementing'), until <variable> reaches <finish constant>. The following example of a FOR LOOP shows how the structure works:

FOR Counter FROM 1 TO 7
    PRINT Counter
ENDFOR

The displayed output would of course be the value of Counter at each pass, that is, the numbers 1 to 7. The variable Counter must be of type integer for the loop to work. Had we used the optional STEP keyword, we could have gone from 1 to 7 in steps of 2 or 3, ie:

FOR Counter FROM 1 TO 7 STEP 2
    PRINT Counter
ENDFOR
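
In Python, a comparable loop can be sketched with the built-in range(). One difference to watch for: range() stops before its second argument, so counting from 1 to 7 is written range(1, 8). The list named stepped is only there to collect the values for display.

```python
# range(1, 8) yields 1 up to, but not including, 8.
for counter in range(1, 8):
    print(counter)  # 1, 2, 3, 4, 5, 6, 7 on separate lines

# A third argument plays the role of STEP.
stepped = []
for counter in range(1, 8, 2):
    stepped.append(counter)

print(stepped)  # [1, 3, 5, 7]
```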

Just like the IF statement, we have made the close of the loop unambiguous with an ENDFOR keyword or token.

The next two forms of iteration are WHILE <expression> ... ENDWHILE and REPEAT ... UNTIL <expression>. Notice that both have an expression to test for a condition: WHILE tests before running any statements, and REPEAT tests after running its statements. Eg:

C = 99
r = 0
WHILE C < 100
    r = r + 1
    C = 2*PI*r
    PRINT "A circle of Radius ",r," has a circumference of ",C
ENDWHILE

Notice the assignments of the variables C and r before entering the WHILE statement. If the variables were left in an unknown condition before entering, C might happen to hold a value such as 105, left over from whatever was in the memory when it was allocated to the program. Under that condition, the expression would evaluate to false and the statements inside the WHILE loop would not run.

WHILE loops are used where the code does not have to be run unless the condition is met.
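
The WHILE example can be sketched in Python as follows. Wrapping the loop in a function named grow_while, and counting the passes, are additions made for this sketch so that both starting conditions discussed above can be tried.

```python
import math


def grow_while(c, r):
    """Grow r until the circumference c is no longer below 100."""
    passes = 0
    while c < 100:  # tested BEFORE each pass
        r = r + 1
        c = 2 * math.pi * r
        passes = passes + 1
    return r, c, passes


# Starting values from the pseudo code: the loop runs until r is 16.
print(grow_while(99, 0))

# If c already holds 105, the block never runs at all.
print(grow_while(105, 0))  # (0, 105, 0)
```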

Had the same code been in a REPEAT loop, the enclosed block would have run once before the conditional test. If the condition is not met, the block is re-run until the condition is met.

r = 0
REPEAT
    r = r + 1
    C = 2*PI*r
    PRINT "A circle of Radius ",r," has a circumference of ",C
UNTIL C > 100

REPEAT loops are used where the block must run at least once before the logical test condition is checked.
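
Python has no REPEAT ... UNTIL of its own, so this sketch imitates one with an endless while loop that is left with break once the test succeeds. The function name repeat_until and the pass counter are made up for illustration.

```python
import math


def repeat_until(r):
    """Grow r until the circumference exceeds 100, testing AFTER each pass."""
    passes = 0
    while True:
        r = r + 1
        c = 2 * math.pi * r
        passes = passes + 1
        if c > 100:  # plays the role of UNTIL C > 100
            break
    return r, c, passes


# Starting from r = 0, the block runs until r is 16.
print(repeat_until(0))

# Even when the first result is already over 100, the block runs once.
print(repeat_until(50)[2])  # 1
```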

Sending and Receiving

To interact with a user, the pseudo code must be able to send and receive messages. Without this ability, a program can neither ask the user for data nor show a result.

PRINT <expression>

sends the expression to the screen. The expression may be a message like "Enter your age", the result of a calculation such as 4+5, or the value of a variable or constant. Eg:

PRINT "Please enter your age"

sends the text inside the inverted commas to the screen. This message is actually a 'prompt' telling the user what the program needs from them. The PRINT command allows a programmer to prompt users, provide feedback and give results. Eg:

PRINT CircumferenceOfaCircle

would send the value of the variable CircumferenceOfaCircle (ie the contents of the container) to the screen. To make the answer more meaningful to the user, the programmer could have used combinations of messages and variables:

PRINT "The Circumference of a circle with radius "; radiusInputByUser; " is "; CircumferenceOfaCircle; "."

which would send

The Circumference of a circle with radius 10 is 62.83.

to the screen for the user to see.

Finally, INPUT only works with variables (constants by their nature will not accept changes to their values), so:

INPUT radiusInputByUser

would take the value from the user at the keyboard and put it into the variable.
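
Putting INPUT and PRINT together, a Python sketch might look like the following. The function name circumference_message is invented here. It takes the typed text as a parameter so the example runs without waiting at the keyboard, and the comment at the end shows how Python's built-in input() could supply that text instead.

```python
import math


def circumference_message(radius_text):
    """Build the result message from the text the user typed."""
    # Whatever is typed arrives as text, so convert it to a number first.
    radius_input_by_user = float(radius_text)
    circumference = 2 * math.pi * radius_input_by_user
    return ("The Circumference of a circle with radius "
            f"{radius_input_by_user:g} is {circumference:.2f}.")


print(circumference_message("10"))

# With a real user at the keyboard, the text could come from input():
# print(circumference_message(input("Please enter the radius: ")))
```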

Exercises

Using the material presented, write on paper pseudo code to:

  1. Decide if a person's age is over 18.
  2. Count from 1 to 10 and display the count on the screen.
  3. Perform a conditional test before running the count program. (Hint: while you are thinking, read the section on loops)
  4. Perform a conditional test after running the count program at least