This article is part of a series of articles intending to offer a curriculum of Computer Engineering. For information, please see Category:Computer engineering curriculum.
Often in programming, we need to deal with lists of values. For instance, maybe we're a teacher with a grading program, and we have a list of all the grades a student has received for the semester. In C++ we can do this with an array. At a very high level, that's basically what an array is: a list of values with compatible types. Each value in the array has a numerical index. To get or set a specific value in the array, you need to specify its index with an RHS value. Array indices in C++ must be of an integer type, such as int.
Arrays as variables
Arrays in C++ must have a type, just like all other values. Every value in the array must be compatible with this type. When we declare a variable to be used as an array, we must specify this type, just like with all variables. To designate that a variable is an array, we use square brackets, as shown below:
int x []; //An array of int-type values
However, the above code is actually incorrect. In C++, arrays must have a declared length. We designate this by placing the length inside the square brackets. An important note is that the length of an array must be known at compile time so the compiler knows how much room to set aside for it. That means you can't use ordinary variables or values computed while the program runs. You need a constant the compiler can evaluate, most simply a literal, and it must be of an integer type. A common practice is to use #define replacements for this purpose, because the length of an array is something you may need to use frequently. Using a #define will drastically reduce the amount of work you have to do if you ever need to change the length. Just remember the usual warning about #define.
So the following code will compile and run; it just won't do anything as far as you can tell.
#define MY_ARRAY_LENGTH 10

int main(){
    //Use a #define for array length
    int x [MY_ARRAY_LENGTH];
    //We can also use integer type literals directly
    int y [12];
}
Array literals
Just as we have literals for the various primitive types, we can create literal array values as well. An array literal is wrapped in curly braces ( { and } ), and is populated with a comma separated list of values. However, the only time we can use an array literal is for an inline assignment of an array variable as it's being declared. In this case, and only this case, we don't need to explicitly specify the length of the array variable, because it is implicitly declared by the array literal.
Here's an example of declaring an array variable with an array literal. Note that you can use any RHS value inside the array literal:
int main(){
    int y;
    y = 15;
    //Declare and define an array with an array literal.
    int x [] = {12, 13, 14, y, 16};
}
The following code, however, will not work, because an array literal must be inline with the declaration:
int main(){
    int x [5];
    x = {12, 13, 14, 15, 16};
}
Accessing elements in an array
An array is what's known as a 'random access' data structure, because you can always access any piece of data in it. This is like the Random Access Memory (RAM) in your computer: the processor can access any of the data in the memory at any time. There are other data structures where you can only access the most recently added piece of data, or others where you can only access the oldest piece of data.
Retrieving values from an array
So how do we achieve this random access? Like I said in the introduction, with the index. An index into an array is just an integer number that indicates which element we would like to access. Remember, arrays are like lists: they're ordered. Each element in the array has a numerical position, a rank if you prefer. To get to it, all we need to do is know its rank.
Each location in the array is like an unnamed variable: it can be LHS or RHS. In other words, we can assign values to any location in the array, and we can retrieve a value from any location. To indicate which position (that is, to provide an index), we use square brackets again, following the array variable's name, with the index inside the brackets. In this case, we can use any integer type RHS value for the index. It does not need to be known at compile time (that was only for declaration, so the compiler knew how much memory to set aside). Try the example below:
#include <iostream>
using namespace std;

int main(){
    int x [] = {23, 24, 25, 26, 27};
    int i;
    i = 0;
    cout << x[i] << endl; //index provided by variable
    cout << x[i+1] << endl; //index provided by result of operation
    cout << x[2] << endl; //index provided by immediate
}
If you run this, you should get the numbers 23, 24, and 25 printed on three consecutive lines. Notice that the first line corresponds to the element in the array named x at an index specified by i, and it printed a value of 23, which is the first element in the array (according to the array literal). But look up a few lines and notice we defined i as 0. Yes, that's right. For reasons which will become more and more clear to you as you continue on through programming, C++ and many other programming languages use 0 (zero) as the first index into an array. This is called zero-based indexing and, as confusing as it may be sometimes, it's incredibly useful. The second element has an index of 1, the third is indexed at 2, etc.
Printing an array
Now printing an array is a bit of a procedure. Feeding it directly to cout the way you would another value doesn't work as you might hope. You can go ahead and try it if you want, but it's not going to simply show you a nice pretty list of values. It'll probably show you a short string of digits and letters from A to F (a memory address written in hexadecimal). Don't worry about that now, just know it's not what we want. Instead, what we'll have to do is print each element individually.
To do this, we'll use our new friend, the for loop. We could use any of our loops, but a for loop is well suited to iterating over an array. Think about what we actually want to do: create a variable that starts at our first index (the for loop initializer), increment our index variable each loop (the for loop incrementor), and stop if our index variable goes past the end of the array (the for loop test). Try the sample code below:
#include <iostream>
using namespace std;

#define ARRAY_LENGTH 5

int main(){
    int x [] = {23, 24, 25, 26, 27};
    //routine to "pretty-print" the array
    int i;
    cout << "{ ";
    for(i=0; i<ARRAY_LENGTH; i++){
        cout << x[i] << " ";
    }
    cout << "}" << endl;
}
You should get:
{ 23 24 25 26 27 }
Note that our for loop test is checking to see if the index is less than the length of the array. Remember, the test gets performed at the beginning of each iteration, and the loop keeps going if the test is true. Since our arrays use zero-based indexing, the last element in the array is going to have an index equal to one less than the length of the array. So in this case, when i gets up to 4, it will be less than the length (5), so the loop will iterate again, printing the element at index 4, which is the last index. The next time, the incrementor will assign i a value of 5, the test will fail, and we'll break out of the loop.
Assigning values into an array
Assigning a value to a position in the array is easy. It's done the same way we assign to any variable, except in this case our variable is unnamed, so we just have to specify the array and the index into the array. Try the code below, which is based on our previous example. It creates the same array, prints it, reassigns some of the values in it, then prints it again.
#include <iostream>
using namespace std;

#define ARRAY_LENGTH 5

int main(){
    int x [] = {23, 24, 25, 26, 27};
    //routine to "pretty-print" the array
    int i;
    cout << "{ ";
    for(i=0; i<ARRAY_LENGTH; i++){
        cout << x[i] << " ";
    }
    cout << "}" << endl;
    //assign some new values
    x[0] = 15;
    x[4] = 30;
    //Print it again
    cout << "{ ";
    for(i=0; i<ARRAY_LENGTH; i++){
        cout << x[i] << " ";
    }
    cout << "}" << endl;
}
Run that and you should get:
{ 23 24 25 26 27 }
{ 15 24 25 26 30 }
As with retrieving values, indices for assigning values can be any integer RHS value.
Some more details about arrays
Memory layout
So once we get past the square brackets, and the indices, and the array literals, what exactly is an array? How does the computer store an array? That's an excellent question and I'm glad you asked. What an array really is, is just a contiguous block of memory inside the computer. That's it, it's just a bunch of values stuck back to back in your computer's memory. The first element in the array exists at a certain location in memory. When you specify an index into an array, all you're really doing is telling the computer how far away from that initial memory location you want to go. In this case, it's not measured in bytes, though, it's measured in elements. But only on your side. When you tell the compiler (with your source code) to go to index 2 of an int-array, it's going to convert that index into a certain number of bytes based on the size of an int. If an int is four bytes, as it often is, then index 2 of an int-array corresponds to an offset of 2 * 4 bytes = 8 bytes. And that's what the compiler is actually going to tell the computer.
Out of bounds
So what happens if you specify an index past the end of the array? You're basically asking for trouble, but the compiler won't try to stop you. Neither will the computer itself, because it has no idea that a certain location in memory is supposed to be "an array"; it's all just memory to the computer. When you declared your array with a specific length, the compiler set aside a block of memory big enough to hold that array, with the length you told it. Now you're going to try to access it past that length, and the compiler is going to generate the correct offset to do that, and the computer's going to go ahead and do it.
If you're lucky, the generated offset will place you into some protected area of memory. When the computer tries to access it, in this case, your program is going to crash. And this is the lucky scenario? Yes, it is. If you're not lucky, instead of landing in some protected part of memory, you'll land in some unprotected part of memory where valuable data is being stored. If you try to write to this out of bounds array index, you'll end up overwriting that other data, which could be very bad if, for instance, that data is the corruption protection keyword that prevents your entire system from going into meltdown.
On the other hand, if you try to read from out of bounds, you might pull up what you think is valid data (it was stored "in" your array after all), but is actually just garbage. This is pretty bad news if, let's say, your array is supposed to contain sampled values of the electrical current running through a managed power supply. If you pull up garbage data that happens to correspond to 0 amps, instead of the actual data which was 12.5 amps, you're going to think your system is losing power, you're going to drive some gates high, and send 25 amps rushing through your 15 amp system.
Some programming languages have built-in run time checking to make sure this doesn't happen. C++ does not. Not being aware of this, or being aware and not doing anything about it, can leave your program, and more importantly your users, extremely vulnerable. This kind of bug, known as a buffer overflow, was a major issue with Internet Explorer and some other web browsers for years. The typical scenario goes like this: the array that holds a URL has a fixed length, but there is no bounds checking. Some people figure this out and begin creating specially crafted URLs longer than that length. When a browser tries to go to such a URL, it writes the too-long value into the too-small array. The extra bytes just continue to get written into unknown regions of memory, which can lead to all sorts of unfortunate scenarios even more dire than the ones described above.
So be very careful about array indexing and bounds checking. The safest way is to specifically check all indices before accessing them, to make sure they are within the valid range. Which, by the way, means no values less than 0. Because no, the C++ compiler won't complain about negative indices, either.
Finding the length of an array
It may sometimes be useful to determine at runtime the length of an array. I have a way of doing that, but first, let me say that this is not common practice, and probably isn't a good idea. Second, let me say that this only works for arrays which have a statically defined length (defined at compile time, like everything we've talked about so far), in which case you should already know the length.
That said, you can determine the total number of bytes set aside for an array using the sizeof operator. For instance, an int-array of length 5 (assuming ints are 4 bytes) has a total size of 20 bytes (5 entries * 4 bytes per entry), and that is what sizeof will give you for the array. To determine the number of elements in the array, you also need the size of a single element. All elements are the same type, so they're all the same size. You can therefore figure this out by using sizeof on the first element. The length of the array is just the sizeof the entire array, divided by the sizeof the first element.
