CSCI1300 Notes 2: Information Holders

Purpose: (1) Make sure you can accurately envision what is happening when a program moves information around in memory. Note that this is mostly about understanding what programs do, not about writing them. (2) Be able to write and run a simple program.

Working effectively as a team.

Find one or more classmates to work with you on this material. Teammates should help one another understand the concepts, and help one another fiddle around with the software to make it work. A team can be as few as two people or as many as four… more than four will be unwieldy.

You and your teammates should work through the material together, making sure everyone is satisfying the requirements. It won't work for team members who may have more background to just "do the work for" other team members who need to develop knowledge, understanding, and skill.

If you encounter ideas or problems no one on your team can help with, be sure to bring these up in class or office hours. Course staff are here to help everyone master this material, and you WILL be able to master it. Don't be put off by the appearance of difficulty at first... if you are patient you will be pleasantly surprised at how quickly you will start to understand things that are gibberish at first.

Each team member will be keeping track of the time spent working on this assignment, for their individual logs.

This assignment has some requirements you'll need to work on at the computer, in the lab and/or in someone's room, or on someone's laptop, and some requirements that call for paper and pencil or blackboard work. Plan out your overall use of time, when you'll work together in the lab, etc.

To check everyone's mastery of concepts, it's a very good idea for team members to make up questions for one another. Don't settle for just working out the specific examples included in the notes. And don't forget to draw pictures of the information holders you are working with, as explained below!

Background

Memory and information holders. As you’ve seen, computer memory is made up of a large number of electronic gizmos called bits, each of which can hold a zero or a one.  Groups of bits are used to contain numbers and other information in the computer. We'll call a group of bits that's used to hold information an information holder. A more commonly used, but not as clear, term is variable.

The bits in an information holder are used in different ways depending on the form of information to be stored. Thus the bits are used differently to hold whole numbers, like 3 or -23, from the way they are used to hold numbers with decimal parts, like 3.4 or -124.675. They are used differently again to hold characters, like 'a' or '%'.

Declarations. You put declarations in your programs to tell the computing system that you want to create information holders for particular kinds of information and give them names. The system (actually, the part of the system called the compiler) keeps track of where in memory your information holder will be, what kind of information you want the bits to represent, and how many bits you want.

The declaration

int foo;

creates an information holder called foo that will be used to hold whole numbers. An int information holder is typically made up of 16 or 32 bits.

The declaration

float bar;

creates an information holder called bar that will be used to hold numbers with decimal parts, like 3.14. A float information holder is typically made up of 32 bits. These bits are used in such a way as to be able to hold numbers over a very large range, from about -10^38 to 10^38 and as small in magnitude as 10^-38, but only approximately. With only 32 bits to work with you can't possibly represent that many different numbers exactly.

Kinds of information holders, like int and float, are called data types.

You can't get around having declarations in C and similar languages. Every kind of information is represented with bits, and so there's no way to tell what information you've got just by looking at the bits. The declaration is essential to indicate the intended interpretation of a given batch of bits.
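If you are curious how many bits your own system uses for these types, a sketch like the following can tell you. The function names are made up for this illustration; sizeof and CHAR_BIT are standard C, and the numbers you get depend on your compiler and machine.

```c
#include <limits.h>  /* CHAR_BIT: the number of bits in one byte */

/* Hypothetical helper, named just for this sketch:
   the number of bits in an int information holder on this system. */
int bits_in_int(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Hypothetical helper:
   the number of bits in a float information holder on this system. */
int bits_in_float(void)
{
    return (int)(sizeof(float) * CHAR_BIT);
}
```

Calling printf("%d %d\n", bits_in_int(), bits_in_float()); from main() would display both counts. The C language only promises that an int has at least 16 bits, so don't assume your result matches anyone else's.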

Arrays. Very commonly one wants a group of many information holders of the same kind, with a convenient way to refer to the different ones. Such a group is called an array. The declaration

int baz[100];

tells the compiler you want 100 information holders, each capable of holding an int. The name baz refers to this whole collection. The number of information holders in an array, 100 in this case, is called its size, or its length, or sometimes its dimension.

Subscripts. To refer to one of the individual information holders in an array you use numbers called subscripts, for historical reasons, or indices. Thus baz[37] refers to the particular information holder in baz whose subscript is 37.

The individual information holders in an array are given subscripts running from 0 to n-1, where n is the size of the array. Thus the individual information holders in baz run from baz[0] to baz[99]. There is no baz[100]!
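Here is a small sketch of these ideas, using an array declared just like baz. The function names are hypothetical, chosen only for this illustration, and the values 10 and 20 are arbitrary.

```c
/* Hypothetical helper: the number of information holders in an array
   declared like baz in the text. */
int size_of_baz(void)
{
    int baz[100];
    /* total bits used by baz, divided by the bits used by one element */
    return (int)(sizeof baz / sizeof baz[0]);
}

/* Hypothetical helper: uses the first and last valid subscripts. */
int first_plus_last(void)
{
    int baz[100];

    baz[0] = 10;    /* the first information holder in baz */
    baz[99] = 20;   /* the last one; there is no baz[100] */

    return baz[0] + baz[99];
}
```

Here size_of_baz() gives 100 and first_plus_last() gives 30.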

Initial values. One way to put information into an information holder is to write it in the declaration. The declaration

int foo=3;

creates an int information holder called foo and puts the value 3 into it. A value specified in this way is called an initial value.

The declaration

int baf[3]={1,2,3};

creates an array of 3 int information holders and puts the values 1, 2, and 3 into them in order.
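As a quick check on the "in order" part, this sketch (with hypothetical function names) packs the three initial values of baf into one number so you can see which value landed in which information holder.

```c
/* Hypothetical helper: returns the initial value given to foo. */
int initial_value_of_foo(void)
{
    int foo = 3;
    return foo;
}

/* Hypothetical helper: combines baf[0], baf[1], and baf[2] into a
   three-digit number, so the order of the initial values is visible. */
int initial_values_of_baf(void)
{
    int baf[3] = {1, 2, 3};   /* baf[0] gets 1, baf[1] gets 2, baf[2] gets 3 */
    return baf[0] * 100 + baf[1] * 10 + baf[2];
}
```

initial_value_of_foo() gives 3, and initial_values_of_baf() gives 123, confirming that 1 went into baf[0] and 3 into baf[2].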

Reading from the keyboard. Another way to put a value in an information holder is to read in a value from the keyboard, that is, to let the user type in a number when the program is running that will go into the information holder. The statement

scanf("%d",&foo);

will wait for the user to type in a whole number and press the return key before the program proceeds. The number typed in will be put into foo.

Notice the & in front of foo... it's got to be there, but don't worry yet about understanding what it means.

If foo were a float information holder, rather than an int, you'd say

scanf("%f",&foo);

instead. The "%d" and "%f" indicate whether you are reading a decimal whole number or a float.

Anything you can do with a simple information holder like foo you can also do with any of the individual information holders in an array. For example,

scanf("%d",&baf[0]);

will read a value into the first information holder in the array baf (remember that the individual info holders in an array are numbered starting with 0.)
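The sketch below shows the same format strings at work. So that it can run without anyone typing, it uses sscanf, a standard library relative of scanf that reads from a piece of text instead of the keyboard; that substitution, the function names, and the choice to fall back to 0 are all assumptions made for this illustration. It also checks the value sscanf hands back, which is the number of items it managed to read.

```c
#include <stdio.h>

/* Hypothetical helper: reads a whole number out of the text in typed,
   the way scanf("%d",&foo) would read one from the keyboard. */
int read_whole_number(const char *typed)
{
    int foo = 0;

    /* Note the & in front of foo, just as with scanf. */
    if (sscanf(typed, "%d", &foo) != 1) {
        foo = 0;    /* nothing usable was found; fall back to 0 */
    }
    return foo;
}

/* Hypothetical helper: the same idea for a float, using "%f". */
float read_decimal_number(const char *typed)
{
    float bar = 0.0f;

    if (sscanf(typed, "%f", &bar) != 1) {
        bar = 0.0f;
    }
    return bar;
}
```

With these, read_whole_number("42") gives 42 and read_decimal_number("2.5") gives 2.5. scanf reports the number of items it read in the same way, so a program can notice when the user typed something that was not a number.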

Assignment statements. You can also put a value into an information holder using an assignment. The simplest form of an assignment is something like

baf[1]=foo;

This makes a copy of the information in foo and puts it into baf[1]. The assignment

foo=baz[37];

copies the value from baz[37] into foo. The assignment

baz[3]=baz[5];

copies the value from baz[5] into baz[3], and so on.

Note that the = sign as used in an assignment has almost nothing to do with the ordinary mathematical = sign. It’s true that after you do an assignment like

foo=bar;

the values of foo and bar will be the same (for the moment), but foo=bar; and bar=foo; have very different meanings, whereas for the mathematical symbol = they mean the same thing.

An assignment does not alter the information in the information holder on the right. In particular it does not remove the information that is there. That's why you want to think about assignment as making a copy of the information. If foo contains 37, and we do the assignment

baf[2]=foo;

foo will still contain 37 afterwards.

An assignment replaces whatever information was in the information holder on the left side. So in the example just above, whatever information used to be in baf[2] is replaced by 37.
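The copying behavior can be checked with a sketch like this one. The function name is hypothetical, and the == and && used in the last statement are comparison tools these notes have not covered yet; read that line as "foo is 37 and baf[2] is 37".

```c
/* Hypothetical helper: returns 1 if, after the assignment baf[2]=foo,
   foo still holds 37 and baf[2] now holds 37 as well; 0 otherwise. */
int copy_leaves_source_alone(void)
{
    int foo = 37;
    int baf[3] = {1, 2, 3};

    baf[2] = foo;   /* copies 37 into baf[2], replacing the 3 that was there */

    return foo == 37 && baf[2] == 37;
}
```

copy_leaves_source_alone() gives 1: the value on the right was copied, not moved, and the old contents of baf[2] are gone.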

Assignments don't have to have information holders to the right of the = sign, though these examples all do. Instead they can have expressions, such as foo+blee, and lots of other forms we'll see later.

Conversion. If you use an assignment to copy the value of an information holder of one type into an information holder of a different type, the value has to be converted. Depending on the types involved, conversion may or may not be possible. If it is possible, it may give unexpected results.

When an int value is converted to a float, the resulting value may be only a good approximation to the exact whole number the int contained: small whole numbers convert exactly, but very large ones may not. Usually this is harmless.

When a float value is converted to an int, it is truncated. That is, the decimal part of the float value is discarded. Thus if foo is an int and bar is a float containing the value 34.67, the assignment

foo=bar;

will result in foo containing 34. Notice that truncation is NOT rounding, which might be what you would expect.
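Both kinds of conversion can be tried out with a sketch like this. The function names are hypothetical, and the remark about 16777217 assumes a float is a 32-bit IEEE 754 number, which is common but not guaranteed.

```c
/* Hypothetical helper: converts a float to an int by assignment. */
int float_to_int(float bar)
{
    int foo;

    foo = bar;      /* the decimal part is discarded (truncation) */
    return foo;
}

/* Hypothetical helper: converts an int to a float and back again. */
int int_to_float_and_back(int whole)
{
    float bar;
    int foo;

    bar = whole;    /* may be only approximate for very large values */
    foo = bar;
    return foo;
}
```

float_to_int(34.67f) gives 34, and float_to_int(-34.67f) gives -34, because truncation simply drops the decimal part. int_to_float_and_back(1000) gives 1000 back unchanged. On a machine with 32-bit IEEE 754 floats, int_to_float_and_back(16777217) gives 16777216: the float had no way to hold the original value exactly.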

Literals. Assignment statements can have numbers, called literals, on the right hand side:

foo=16;

puts 16 into foo. It does not make sense to put literals on the left side of an assignment, as in

16=foo;

That's because 16 is a number, not an information holder, so you can't put anything into it.

Using information holders as subscripts. As you've already seen, you can use literals as subscripts, as in baz[37] or baz[0]. But you can also use information holders as subscripts, as in baz[foo]. The individual information holder you get is based on what value is in foo. For example, if foo now contains the value 37, then baz[foo] is the same as baz[37]. If foo contains the value 0, then baz[foo] is the same as baz[0].

Here's an example illustrating this last point. Notice that I'm including the declarations up front. Be sure you can read and follow each step:

int a; //This gives us an info holder for ints called a.

int b[10]; //This gives us an array of 10 info holders for ints

            //called b[0], b[1], ..., b[9].

a=2; //This puts 2 into a.

b[a]=3; //Since a contains 2, this puts 3 into b[2].

a=b[a]; // Since a contains 2, b[a] is b[2]. We know that b[2]

            //contains 3. So this puts 3 into a.

b[a]=23; //Now a contains 3, so b[a] is b[3]. So this puts 23

            //into b[3].

a=b[a]; //a still contains 3, so b[a] is b[3], which contains 23.

      //So this puts 23 into a. Notice that the value of a

      //we use to figure out what part of b we are using is

      //the value of a before the assignment happens.

Our story so far is intelligible, though perhaps not very dramatic. Let's add some excitement by making some common mistakes.

 a=b[a]; //Now a contains 23, so this will take the contents

      //of b[23] and copy them into a. But there is no

      //information holder b[23]! What will happen is

      //that some random value, whatever the bits are that are

      // sitting where b[23] would be if there were one, will be

      // copied into a.

b[a]=13; //Now we are really in trouble. We have no idea

      //what value is in a, so we are trying to access some

      //piece of the array b that may exist (if the value of

      //a happens to be between 0 and 9) or may not exist

      //(if the value of a is something else, which it

      //almost certainly is.) If b[a] does not exist what will

      //happen is that 13 will be copied into some

      //random part of memory, with potentially

      //disastrous results.

There's an important moral for this last piece of the story. When you program in C nobody checks the subscripts you use to be sure they make sense. Using subscripts outside the valid range for your array is a very common, and often very tricky, program error. You have to learn to guard against this (which may be especially hard if you have been working in a language like BASIC that checks this for you.)
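One way to guard against bad subscripts is to check them yourself before using them. The sketch below is one possible approach, not the only one; the function names, the idea of a fallback value, and the if statement (which these notes have not covered yet) are all introduced just for this illustration.

```c
/* Hypothetical helper: returns b[index] if index is a valid subscript
   for an array of size ints, and fallback otherwise. */
int checked_read(const int b[], int size, int index, int fallback)
{
    if (index < 0 || index >= size) {
        return fallback;    /* out of range: do not touch b at all */
    }
    return b[index];
}

/* Hypothetical helper: stores value in b[index] only if index is valid.
   Returns 1 if the store happened and 0 if it was refused. */
int checked_write(int b[], int size, int index, int value)
{
    if (index < 0 || index >= size) {
        return 0;           /* out of range: memory is left alone */
    }
    b[index] = value;
    return 1;
}
```

With int b[10]; declared, checked_write(b, 10, 3, 23) stores 23 in b[3] and gives 1, while checked_write(b, 10, 23, 13) refuses and gives 0 instead of scribbling on some random part of memory.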

Pictures of information holders. Programmers often draw pictures to help keep track of what values are in what information holders. Such a picture just has a little box for each information holder, with arrays being drawn as columns of boxes next to each other. The values are written into the boxes, and erased and replaced as needed when the values change.

You will find it useful to use this kind of picture while you work on the exercise. Even very experienced people do this, because otherwise it's just too easy to forget what is where... this is not just a crutch for beginners that you can ignore. Learn to do it!

Seeing what's in an information holder when a program runs. Very often you want a program to get some information into an information holder and then show what the information is. If a is an int information holder, the statement

printf("%d", a);

will examine the value that is in a and display it on the computer screen. In a way this is the inverse of the somewhat similar-looking statement

scanf("%d",&a);

The first statement, the printf, takes the value in a and gives it to the outside world (this is called doing output), while the second statement, the scanf, gets a value from the outside world via the keyboard (this is called doing input).

You can display the values of as many information holders as you want, all in one statement, as in

printf("%d %d %d", a,b,c);

which will display the values of the information holders a, b, and c in that order.

You can also display any text on the screen, by enclosing the characters you want to display in quotes, as in

printf("Here is a little message");

You can mix in printouts of the values of information holders into a message, like this:

printf("The value of a is %d and the value of b is %d",a,b);

The two %d's inside the quotes get replaced by the values of a and b, in order, when this statement runs.

If you just wrote

printf("%d%d",a,b);

you'd get a display of the values of a and b, but the values would be smashed together, since you didn't ask for anything to be displayed in between, and they would not be labelled.

Terminology: that string in quotes in a printf() or scanf() statement is called a format string.

One last point: when a program displays a lot of information it fills up more than one line on the screen, and the display can be hard to read. To control this you can put a special item, \n, in your format string that says to go to a new line, like this:

printf("The account balance is %d\n",bal);

After displaying the text and the value of bal the system will move on to the next line of the display, so that the next information to be displayed will not go on the same line with this information.
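To compare these format strings side by side, you could try a sketch like the following. The function name is hypothetical; the format strings are the ones discussed above, with \n added so each display gets its own line.

```c
#include <stdio.h>

/* Hypothetical helper: displays the values of a and b three ways,
   one display per line. */
void show_three_ways(int a, int b)
{
    /* Nothing between the two %d's: the values run together. */
    printf("%d%d\n", a, b);

    /* A space between the two %d's keeps the values apart. */
    printf("%d %d\n", a, b);

    /* Text in the format string labels each value. */
    printf("The value of a is %d and the value of b is %d\n", a, b);
}
```

Calling show_three_ways(3, 4); displays 34 on the first line, 3 4 on the second, and the fully labelled message on the third.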

What the computer understands. We can make a beginning here on an important matter that may take you some time to grasp: how little the computer understands about your program. For the computer your program is just a series of detailed instructions that it must carry out. It has no idea what your program is about, or any knowledge of the world that would help it to understand what you are trying to do.

Here's an example that illustrates this:

printf("The value of a is %d and the value of b is %d\n",b,a);

A human being may be able to see that this piece of program is probably wrong, and that the display produced will be very misleading. But the computer has no such idea. It will simply do what is asked. To it, a sequence of characters in quotes like "The value of a is" has absolutely no meaning, other than specifying characters that the statement says need to be drawn on the screen.

You'll learn that the computer will always do exactly what you ask it to do, whether that makes any sense or not, as long as you've followed the rules of the C language.

Another aspect of this has to do with the names you provide when you declare your information holders. The computer only uses these names to identify what information holder you are referring to. Calling an information holder x, total, average, or eggplant will make absolutely no difference to how the computer treats it.

This does not mean that it does not matter what names you use for your information holders. It is true that the names have no effect on what your programs do, but they make a huge difference in how easily you and other human beings can understand your programs, which as you'll learn is an extremely important matter.

Grouping data items together. One of the keys to creating a clear and correct program is organizing all the data the program manipulates in a logical way. You've seen, for example, that arrays are a useful way to organize data when you have many data items of the same kind that should be processed in similar ways.

Another common situation is that you have groups of data that are of different kinds, or need to be processed differently,  that nevertheless belong together. For example, if a program is managing information about (say) employees, there will be many data items used to describe each employee, such as name, address, salary, and so on. Because these items all describe one person, you want to keep them together. But because they are different kinds of data, you can't use an array. Even if the items were all of the same kind, an array would not be a convenient way to store them, because we'd have to use subscripts to access the different individual items. For example, if the array were called (say) employee_stuff, we'd have to remember that employee_stuff[0] is the name, and that employee_stuff[1] is the address, and so forth. We could live with this if we had to, but it's not very natural.

C provides another way to organize data that works well in cases like this. We can create information holders called structs that contain other information holders, called members, within them. The members can be of any type, and each member has a name that is used to refer to it.

Creating structs, the way I recommend you do it, is a little more complicated than creating arrays. Rather than just declaring an information holder to be a struct, you first should define a struct type. This is a new data type, like int or float, that you get to define. You tell the system what you want to call the new struct type, what members you want it to have, and what names you want to use for each member. Then you can declare as many information holders as you want to have that new struct type. You can even declare arrays of your new type, or define other struct types whose members are of your new type.

Here's an example of how this works. Suppose your program needs to work with descriptions of widgets, each of which is described by three numbers, a serial number, a weight, and a height. You can define a new type for managing this information:

struct widget_description

{

      int serial_number;

      float weight;

      float height;

};

Having defined this new type you can use it to declare information holders, just as you can use any other type:

struct widget_description foo;

This creates an information holder foo which has three information holders inside it, an int and two floats. The names of the members in the definition of widget_description are used to refer to the parts of foo, like this:

foo.serial_number

foo.weight

and so on. If you declare another information holder

struct widget_description bar;

you refer to its parts as bar.serial_number, bar.weight, and bar.height.
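Putting the type definition, a declaration, and the member names together gives a sketch like this. The struct type is the one defined above; the function make_widget is a hypothetical helper invented for this illustration.

```c
/* The struct type from the text. */
struct widget_description
{
    int serial_number;
    float weight;
    float height;
};

/* Hypothetical helper: builds one widget description from three values. */
struct widget_description make_widget(int serial_number, float weight, float height)
{
    struct widget_description foo;   /* one box with three boxes inside */

    foo.serial_number = serial_number;
    foo.weight = weight;
    foo.height = height;

    return foo;
}
```

After struct widget_description bar = make_widget(1234, 2.5f, 10.0f); the member bar.serial_number contains 1234, bar.weight contains 2.5, and bar.height contains 10.0.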

If you needed an array full of descriptions of widgets you could declare one like this: 

struct widget_description widget_inventory[100];

That creates an array called widget_inventory that contains 100 widget_descriptions, each of which has a serial number, a weight, and a height. You can refer to any of these pieces of information like this:

widget_inventory[37].serial_number

is the serial number in the 38th widget description in the array widget_inventory.
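A subscript and a member name can be combined on either side of an assignment, as this sketch shows. The function name and the weight and height values are hypothetical, chosen only for this illustration.

```c
/* The struct type from the text. */
struct widget_description
{
    int serial_number;
    float weight;
    float height;
};

/* Hypothetical helper: fills in the widget description with subscript 37
   and then reads its serial number back. */
int serial_in_slot_37(int serial_number)
{
    struct widget_description widget_inventory[100];

    /* First pick the element with [37], then pick the member with a dot. */
    widget_inventory[37].serial_number = serial_number;
    widget_inventory[37].weight = 1.5f;
    widget_inventory[37].height = 2.0f;

    return widget_inventory[37].serial_number;
}
```

serial_in_slot_37(5551) gives 5551. The other 99 widget descriptions in the array are never given values here, so their contents are unknown.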

Assigning structs. A really convenient feature of structs is that you can assign all of the parts at once in an assignment statement, as in

widget_inventory[75]=bar;

This copies the values of all the parts of bar into the corresponding parts of the element of widget_inventory with subscript 75.
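As with simple information holders, assigning a struct makes a copy. The sketch below (hypothetical function name, arbitrary values) checks that all three members were copied and that changing bar afterwards leaves the copy alone. The == and && in the last statement are comparison tools not yet covered in these notes.

```c
/* The struct type from the text. */
struct widget_description
{
    int serial_number;
    float weight;
    float height;
};

/* Hypothetical helper: returns 1 if assigning bar to widget_inventory[75]
   copied every member and the copy is unaffected by a later change to bar. */
int struct_copy_demo(void)
{
    struct widget_description widget_inventory[100];
    struct widget_description bar;

    bar.serial_number = 42;
    bar.weight = 1.5f;
    bar.height = 2.0f;

    widget_inventory[75] = bar;   /* copies all three members at once */

    bar.serial_number = 99;       /* changes bar only, not the copy */

    return widget_inventory[75].serial_number == 42
        && widget_inventory[75].weight == 1.5f
        && widget_inventory[75].height == 2.0f
        && bar.serial_number == 99;
}
```

struct_copy_demo() gives 1, so the one assignment did the work of three separate member-by-member assignments.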

Pictures of structs. Draw a struct value as a box with little boxes inside it, each with its member name. Remember, real programmers do this, and so should you!

Exercises

Exercise 2-1.

Make sure all team members can  type in, compile, and run a simple program, like this, on the computers in the lab, or on your own computer:

#include <stdio.h>

int main()

{

     printf("eggplant\n");

     return 0;

}

Here are the steps. Get help anytime you need it!

First, create a folder named after you, inside the cs1300 folder. Then:

(a) Use Notepad to type in the above text (or copy it into Notepad from this Web page.) Save the file using a name like “okra.c” with type “all files”.  Put it in the folder named after you.

(b) Get a command prompt (start->programs->accessories->command prompt).

(c) Type the command cd \cs1300 to enter the cs1300 folder.

(d) Type the command gocs

(e) Type the command cd yourname, where for “yourname” you put the name of the folder named after you.

(f) Type the command gcc okra.c -o e, putting in your program name in place of okra.c.

(g) Type the command e. Your program should run!

Exercise 2-2.

(a) Modify the above program so that it puts 3 into an int variable called carrot, and then prints the value of carrot. (b) Then declare another int variable called cabbage, but do not put a value in it. Add a statement to the program to print the value of cabbage. What result do you get?

Exercise 2-3.

Type in, compile, and run this program. What does it print? Why do you think that is?

Note: There is something wrong with this program, and what happens when you run it depends on choices the compiler makes. If you are "lucky", you'll see an interesting symptom when you run the program. If you are "unlucky", nothing interesting will happen and what the program prints will look normal... but determine what's wrong with the program anyway.

#include <stdio.h>

int main()

{

     int a;

     int b[2];

     int c;

     a=1;

     c=1;

     b[2]=13;

     printf("a contains %d and c contains %d\n",a,c);

}

Exercise 2-4.

Here the idea is for you to develop your ability to envision in advance what will happen when your program runs, rather than being dependent on trial and error.

So answer these questions WITHOUT creating and running a program. Make sure each team member can answer all of these questions, and others like them, reliably: there will be a Language Check on this material. Include your answers in your submission for this assignment. Also be sure every team member can draw box pictures of examples like these.

If the following program fragments were run, what values would be left in the information holders a and b?

Problem 2-4a

int a,b; //this is a shorthand way of declaring two information holders of the

         //same type at once.

a=1;

b=2;

a=b;

b=a;

Problem 2-4b

int a,b,c;

a=1;

b=2;

c=a;

a=b;

b=c;

Problem 2-4c

int a;

float b;

b=3.78;

a=b;

b=a;

Problem 2-4d

float a,b;

float c[10];

c[2]=3.16;

c[5]=4.0;

a=c[2];

b=c[5];

 

Problem 2-4e

int a,b;

int c[10];

a=0;

b=0;

c[a]=2;

c[2]=3;

b=c[c[b]];

Problem 2-4f

int a,b;

int c[10];

a=2;

b=3;

c[a]=b;

c[b]=a;

a=c[c[a]];

b=c[c[b]];

Problem 2-4g

int a,b;

int c[10];

a=12;

b=c[a];

Problem 2-4h

int a,b;

int c[10];

a=0;

b=3;

c[a]=2;

c[b]=0;

c[c[b]]=5;

a=c[a];

Problem 2-4i

int a,b;

struct point

{

int x;

int y;

};

struct line

{

struct point beg;

struct point end;

};

struct shape

{

struct line lines[100];

int n;

};

struct shape square;

struct point southwest,northwest,northeast, southeast;

struct line west,north,east,south;

southwest.x=0;

southwest.y=0;

northwest.x=0;

northwest.y=1;

northeast.x=1;

northeast.y=1;

southeast.x=1;

southeast.y=0;

west.beg=southwest;

west.end=northwest;

north.beg=northwest;

north.end=northeast;

east.beg=northeast;

east.end=southeast;

south.beg=southeast;

south.end=southwest;

square.lines[0]=west;

square.lines[1]=north;

square.lines[2]=east;

square.lines[3]=south;

square.n=4;

a=square.lines[2].beg.x;

b=square.lines[3].end.y;

Problem 2-4j: What difference would it make if you replaced the word "square" by "triangle" everywhere in the fragment in 2-4i?

Problem 2-4k: Does the code in Problem 2-4i illustrate compositionality? How?

What to turn in

Describe your team's progress in completing each of the problems, including any information that a problem may call for, such as answers to questions, or listings of programs. Also include any questions any team members have about the material, or any difficulties you encountered.

You should prepare your answers together, but each team member should turn in his or her own report, including your time report, as described in the syllabus.