Termipankki
  1. A
    1. API
    2. ASAN
    3. Access specifiers
  2. C
    1. Constructor
    2. const
  3. G
    1. GDB
    2. g++
  4. M
    1. Make
    2. Memory Leak
  5. P
    1. Pointer
  6. R
    1. Recursion
  7. S
    1. std
  8. V
    1. Valgrind
Ratkaistu: / tehtävää

LAB_MEMORY: Leak-Free Code with ASAN and Valgrind

Due date: 2024-04-02 23:59.
Welcome to the lab!
In this assignment, you will learn about two memory checking utilities: AddressSanitizer (ASAN) and Valgrind. These tools will help you identify and fix memory errors in your code. You will also learn about different types of memory leaks and memory errors, and how to fix them.
Please note that starting from this assignment on, our autograder will always check your assignments for memory errors and leaks. If the code you submit has memory leaks and/or memory errors it will not pass the autograder. Therefore, remember to check your code with ASAN or Valgrind before submitting the code.
Notice the difference between a memory leak and a memory error. A memory leak occurs when your program allocates memory but fails to deallocate it, while a memory error occurs when your code accesses memory that it shouldn't. While ASAN is effective at detecting memory errors, it does not have a specific feature to detect memory leaks, and it may not be able to detect certain types of issues that Valgrind will. This lab teaches you how to use both ASAN and Valgrind to effectively manage memory related issues in your code.

Lab Description

For this lab, you will be fixing bugs in our Student-To-Room allocation program. Since our course has many students, exams are usually spread across several rooms. Before the exam, we have to allocate different students to different rooms, so that everyone can take the test with enough space.
For example, if there were only two classrooms of equal size, students in the first half of the alphabet (last name letters A - N) might go to L5, while students in the second half of the alphabet (letters M - Z) might go to PR105.
However, with more rooms, this problem becomes more difficult. In the sample situation provided, there are 7 classrooms for the exam, varying in seating capacity from 80 to 200 seats (i.e. 40 to 100 students seated every-other-desk). Although we'll have to break up the alphabet more, we'd still like to assign students with the same first letter of their last name to the same room, as this makes going to the right room easier.
We've provided you the code to solve this problem, however, it has several memory bugs in it. You'll have to use ASAN and/or Valgrind, as well as some debugging skills from LAB_DEBUG, to find the bugs and fix them. Note, there are no bugs in the fileio namespace.

Download The Provided Files

Just like you already did it in previous labs, start with downloading and unzipping the provided files from here: lab_memory.zip.
The code for this activity resides in the lab_memory directory. Get there by typing the following line in the terminal while in your working directory:
cd lab_memory
You should now have the following files in your lab_memory directory:
You will only need to modify four files for this lab: room.h, room.cpp, allocator.h, and allocator.cpp.

Background on ASAN

ASAN can detect common memory errors such as use-after-free and buffer overflows, and also helps you practice implementing the “big three” memory management functions (copy constructor, assignment operator, and destructor). We will list common memory errors that result in segfaults and other crashes below; ASAN can be quite effective in troubleshooting these.

1. Out-of-bounds access to heap, stack, and globals

This error occurs when you allocate some memory and then try to access a region outside your allocated space.
// Example of out-of-bounds access
int * arr = new int[100];
return arr[100]; //NOPE 

2. Use-after-free (dangling pointer reference)

This error occurs when you try to use memory you have already freed. It is especially helpful when you have several pointers referring to the same memory.
// Example of use-after-free
int * arr = new int[100];
delete [] arr;
return arr[5]; //NOPE 

3. Heap buffer overflow

This error occurs when you go out of bounds within an array created on the heap.
// Example of heap buffer overflow
int * arr = new int[100];
arr[0] = 0;
int ret = arr[5 + 100]; //NOPE
delete [] arr;
return ret; 

4. Stack buffer overflow

This error occurs when you go out of bounds within an array created on the stack.
// Example of stack buffer overflow
int arr[100];
arr[1] = 0;
arr[5 + 100]; //NOPE 

5. Use after return

This error occurs when you return a stack variable at the end of a function and try to use it after it's out of scope.
// Example of use after return
int * arr1;
void func() {
int arr2[100];
arr1 = arr2;
}
int main() {
func();
return arr1[10]; //NOPE
} 

6. Double free or invalid free

Double free occurs when you free an heap object twice. Invalid free's occur when you free a non-heap object.
// Example of double free
int * arr = new int[100];
delete [] arr;
delete [] arr; //NOPE 
// Example of invalid free
int arr[100];
delete [] arr; //NOPE 

7. Memory leak detection

ASAN can detect three sources of memory leakage.

Fixing Memory Bugs with ASAN

Now that you are familiar with ASAN you are ready to start using it to fix the memory bugs in the provided code. Start with compiling the code.

Compiling the Code

We provided you with the Makefile that will compile the code into two executables: allocate and allocate-asan. Run make to compile:
make
The allocate is the program that will run allocation of students from the students.txt file into the rooms from the rooms.txt file. The allocate-asan is the same allocation program, but compiled with options to enable ASAN.
NOTE: To quickly clear out any binaries in your directory type:
make clean
You can see that all of the object (.o ) and executable files in your directory are cleared out.

Running the Code

Because the provided code has memory bugs, once you run the allocate you will see segmentation fault in the terminal:
./allocate

Segmentation fault
Try running the program with ASAN instead:
./allocate-asan
You will see a trace of the first invalid memory access. ASAN terminates the program upon the first such invalid memory access. You'll have to fix it before moving on to other errors. This incremental procedure should help you step through memory bugs one at a time.
Keep in mind, the actual error might be in a different location in the code, not the exact line the ASAN provides upon the crash.
Good luck!

Testing the Code

Once you have fixed all the memory errors, you can test your program output using:
./allocate > output.txt
diff output.txt soln_output.txt
Note that most of the work in this lab consists of fixing memory errors and memory leaks, rather than the program output. The code should run and return correct output once you fix all of the memory bugs.
You may be able to fix ALL of the bugs in this lab using only ASAN, therefore, the next section is optional. You can scroll down to submit your code. However, if you want to learn how to use Valgrind, or if you are having difficulties with using ASAN, move on to the next section.

Background on Valgrind (Optional)

Valgrind is a free utility for memory debugging, memory leak detection, and profiling. It runs only on Linux systems. Many students in the past found Valgrind to be useful in detecting memory errors and leaks. While it performs a similar function to ASAN, it differs in that it does not embed the memory checking code into the binary executable. Instead, it interprets the code during runtime, which can make it slower than ASAN.
To prepare your project to be examined by Valgrind you need to compile and to link it with the debug options -g and -O0. We provided you with the Makefile that uses these options when compiling. In order to test your program with Valgrind you should use the following command:
valgrind --leak-check=full ./allocate 
You will see a report about all found mistakes or inconsistencies. Each row of the report starts with the process ID (the process ID is a number assigned by the operating system to a running program). Each error has a description, a stack trace (showing where the error occurred), and other data about the error. It is important to eliminate errors in the order that they occur during execution, since a single error early could cause others later on. Here is a list of some of the errors that Valgrind can detect and report. (Note that not all of these errors are present in the exercise code.)

Invalid read/write errors

This error will happen when your code reads or writes to a memory address which you did not allocate. Sometimes this error occurs when an array is indexed beyond its boundary, which is referred to as an "overrun" error. Unfortunately, Valgrind is unable to check for locally-allocated arrays (i.e., those that are on the stack.) Overrun checking is only performed for dynamic memory.
// Example of an invalid write error
int * arr = new int[6];
arr[10] = -5; //NOPE

Use of an uninitialized variable

This type of error will occur when your code uses a declared variable before any kind of explicit assignment is made to the variable.
// Example of using an uninitialized variable.
int x;
cout << x << endl; //NOPE

Invalid free error

This occurs when your code attempts to delete allocated memory twice, or delete memory that was not allocated with new.
// Example of an invalid free
int * x = new int;
delete x;
delete x; // NOPE

Mismatched new/free/delete

Valgrind keeps track of the method your code uses when allocating memory. If the memory is then deallocated with different method, that is, a wrong call to free(), delete, or delete [] is used, you will be notified about the error.
// Example of mismatched new and delete
int * x = new int[6];
delete x; //NOPE

Memory leak detection

Valgrind can detect three sources of memory leakage.
//Example of a memory leak
int * x = new int[6]; // a memory leak if there is no corresponding delete [] x

Further Reading

More information about the Valgrind utility can be found at the following links:

Fixing Memory Bugs with Valgrind (Optional)

Note that if you were using any ASAN compilation options, you want to clear out the binaries first:
make clean
Next, compile your program:
make
This will create an executable file called allocate, which you can run with:
./allocate
You can then run Valgrind on ./allocate:
valgrind ./allocate
This works fine for fixing the memory errors, however, to fix the memory leaks, you'll need to add --leak-check=full before ./allocate:
valgrind --leak-check=full ./allocate
Once you have fixed all the Valgrind errors, you can test your program output using:
./allocate > output.txt
diff output.txt soln_output.txt
Once again, note that most of the work in this lab consists of fixing memory errors and memory leaks, rather than the program's output, which should run correctly once the memory errors are fixed.

Submitting Your Work

You can now upload your allocator.cpp, allocator.h, room.cpp, room.h in the box below. All other files, including main.cpp and any testing files you have added will not be used for grading.
Before you submit, make sure that no bugs are reported by either ASAN or Valgrind when you compile and run your code. Additionally, double-check that the output of your program matches the contents of soln_output.txt exactly.

Sallitut tiedostojen nimet allocator.cpp, allocator.h, room.cpp, room.h

Varoitus: Et ole kirjautunut sisään. Et voi vastata.

Acknowledgments

We would like to express our gratitude to prof. Cinda Heeren and the student staff of the UIUC CS Data Structures course for creating and sharing the programming exercise materials that have served as the basis for the labs used in this course.
Revised by: Elmeri Uotila and Anna LaValle
?
API stands for Application Programming Interface. In the context of this course, you can consider the header files to be such interfaces, as they determine what class functions and properties are public and thus accessible from anywhere.
AddressSanitizer (ASAN) is a memory error detector for C/C++. In this course, the makefiles will typically compile an executable that uses ASAN, with "-asan" at the end of its name.
The two notable access specifiers are:
  • public: class members defined after the public keyword are accessible from outside the class.
  • private: class members are generally private by default and thus not accessible from the outside
Constructor is a special non-static member function of a class that is used to initialize objects of its class type. A constructor is called upon initialization of an object. A constructor without arguments is a default constructor, but constructors that take arguments can be defined.
GDB, the GNU Project debugger, allows you to see what is going on `inside' another program while it executes -- or what another program was doing at the moment it crashed.
GDB can do four main kinds of things (plus other things in support of these) to help you catch bugs in the act:
  • Start your program, specifying anything that might affect its behavior.
  • Make your program stop on specified conditions.
  • Examine what has happened, when your program has stopped.
  • Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.
GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program's source files. Make gets its knowledge of how to build your program from a file called the makefile, which lists each of the non-source files and how to compute it from other files. When you write a program, you should write a makefile for it, so that it is possible to use Make to build and install the program.
Memory leak means that the program is not freeing up memory it reserves. The memory will be freed when the program terminates, but if a program keeps leaking more and more memory without terminating, it can become a huge issue!
A typical way for memory leaks to occur is reserving memory with new and not calling delete before the pointer goes out of scope.
Pointer variables store a memory address as their value. In other words, they point to some data. The data can be accessed by dereferencing the pointer. (Either like *p or p->...)
A recursive function calls itself from within it. The recursion call must be conditional or it would lead to an infinite loop.
Valgrind is another tool besides ASAN that you can use in this course. It can detect many memory-related errors that are common in C and C++ programs and that can lead to crashes and unpredictable behaviour.
const is a keyword meant to mark something as immutable
  • A const object cannot be modified: attempt to do so directly is a compile-time error, and attempt to do so indirectly (e.g., by modifying the const object through a reference or pointer to non-const type) results in undefined behavior.
  • const keyword on an object's member function prevents the object from being modified in the function
  • Pointer declarations can have 2 const keywords, one to make the data that's pointed at unable to be modified and one to make the pointer itself unable to be modified
Using const improves code readability and prevents accidental modification of objects.
g++ is a C++ compiler that we primarily use for this course. It is the C++ compiler for the GNU compiler collection. You may sometimes see gcc being used instead of g++, which was originally the GNU C compiler, but calling gcc generally also compiles .cpp files as C++. However, calling g++ is preferred.
In C++, std stands for Standard Library, which is a collection of commonly useful classes and functions. Typically, these are defined in the std namespace.