Implementing C Checkers with PySenpai¶

This is a standalone guide, so a lot of the information is repeated from the Python checkers guide. Python and C checkers are mostly similar. The most important differences are related to using pointers and structures.

As Lovelace is installed on a Linux server, it is recommended that you develop and test your checkers in a Linux environment. This matters more for C checkers than for Python checkers because compilers differ between platforms and other platform-specific issues can appear.

Loading Student C Code¶

PySenpai uses the Python C Foreign Function Interface (CFFI) to call C functions from within a Python program. This process involves compiling the student deliverable into a shared library (.so on Linux, .dll on Windows) and using the function prototypes within the code file to discover functions. This means that student files must define function prototypes.

Currently there are two ways to load a library with the C extension. In the first, the student code is compiled into a library manually; in the second, compilation is handled by CFFI. This guide uses the former approach by default because the latter currently relies on a deprecated CFFI function. The latter is nevertheless the only option for certain types of tests. When compiling manually, the following command should be used as the basis:

gcc -Wall -Werror=missing-prototypes -shared -Wl,-soname,st_library -o st_library.so -fPIC $RETURNABLES

Here st_library can be replaced by a name of your choosing, as long as you change both instances of it. This is the name that is used in the checker. Note that the library file must be found within the environment path in order for any of this to work. For checkers operating inside Lovelace this doesn't matter because everything is contained within the same temporary directory. However, while developing checkers on your own computer this may require some adjustments - or you can just copy the library files to the PySenpai root.

Also note that the command line argument(s) that indicate the file(s) to be checked are the source code files (i.e. the ones in $RETURNABLES if inside Lovelace).
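
When developing on your own computer it can be handy to build the compile command from Python. The following is a minimal sketch: build_compile_command and its lib_name parameter are hypothetical names invented for this illustration, and only the gcc flags come from the command above.

```python
import shlex

def build_compile_command(sources, lib_name="st_library"):
    """Return the gcc argument list for building the shared library.

    Hypothetical helper for illustration; not part of PySenpai.
    """
    return [
        "gcc", "-Wall", "-Werror=missing-prototypes", "-shared",
        # The soname and the output file must use the same library name.
        f"-Wl,-soname,{lib_name}",
        "-o", f"{lib_name}.so",
        "-fPIC",
        *sources,
    ]

cmd = build_compile_command(["energy.c"], lib_name="energy")
# Show the command as a single shell-style string.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where gcc is installed.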

Checker Basics¶

In this section we'll go through the basics of implementing a C checker, and some of the more common and simpler customizations. The file shown below is the skeleton of a minimal checker that tests a function from the student submission.

c_skel.py

import random
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "",
    "en": ""
}

so_name = {
    "fi": "",
    "en": ""
}

msgs = core.TranslationDict()

def gen_vector():
    v = []
    return v

def ref_func():
    return None

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang)

Starting from the top, the st_function dictionary contains the name of the function to be tested, in each of the available languages. The keys are standard language codes. Even if only one language is supported, the function name still must be given as a dictionary. The so_name dictionary gives the name, without extension, of the library that was compiled in the previous stage.

The skeleton also creates another dictionary for messages. This is a dictionary subclass that has methods for accessing the same key with multiple languages. The methods are:

set_msg(key, lang, message_object)
get_msg(key, lang[, default])

This is the type of dictionary that is expected as the argument for custom_msgs in the various testing functions. Very basic checkers may not need to set any messages, in which case this can be ignored.
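
To make the two methods concrete, here is a toy stand-in that mimics the interface listed above. It is only an assumption about how such a dictionary could behave; ToyTranslationDict is a hypothetical class and not PySenpai's actual implementation.

```python
class ToyTranslationDict(dict):
    """Toy stand-in for core.TranslationDict, for illustration only."""

    def set_msg(self, key, lang, message_object):
        # Store one message object per (language, key) pair.
        self[(lang, key)] = message_object

    def get_msg(self, key, lang, default=None):
        # Return the message for this language, or the default if missing.
        return self.get((lang, key), default)

msgs = ToyTranslationDict()
msgs.set_msg("fail_negative", "en", "The error message for negative input was wrong.")
msgs.set_msg("fail_negative", "fi", "Negatiivisestä syötteestä kertova virheviesti oli väärä.")

english = msgs.get_msg("fail_negative", "en")
fallback = msgs.get_msg("no_such_key", "en", "fallback text")
```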

Next up are two mandatory functions. The first generates the test vector which in turn determines the number of test cases. The second is the reference function. For simple checkers this is a Python version of the function from your reference solution. Details and tips for implementing both are given in later sections. There are a lot of other functions that can be defined here, but these are the mandatory ones.

The section under if __name__ == "__main__": contains the code that actually executes the test preparations and tests themselves. The parsing of command line arguments is done by PySenpai's parse_command function which returns the names of source code files and the language argument. If only one file is returned, the first one is the one to check. If the exercise requires multiple files, then you will need some way of figuring out which is which.

The load_library function does the necessary preparations that allow the rest of the tests to call C functions defined in the student code. This function returns an interface object that exposes the C functions as callable Python attributes - from the checker's perspective it works mostly just like a Python module. However, one limitation of load_library is that it cannot expose global variables from the student code. For this you need to use load_with_verify (see separate section).

In the skeleton one function is called (there's no lint test for the C extension yet). Note that test_c_function has a lot of optional parameters for various purposes - we'll get to each of them eventually in this guide chapter.

If we fill in the function name dictionary and the two functions, we have ourselves a basic checker. Below is an example that checks a function that calculates kinetic energy (it's a C version of the example that was used for Python checkers).

kinetic_test_basic_c.py

import random
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "laske_kineettinen_energia",
    "en": "calculate_kinetic_energy"
}

so_name = {
    "fi": "energia",
    "en": "energy"
}

msgs = core.TranslationDict()

def gen_vector():
    v = []
    for i in range(10):
        v.append((round(random.random() * 100, 2), round(random.random() * 100, 2)))
    
    return v

def ref_func(v, m):
    return 0.5 * m * v * v

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang)

About Test Vectors¶

A test vector should contain one list of arguments/inputs for each test case. The number of test cases is directly derived from the length of the test vector. Note that the test vector must always contain lists (or tuples) even if the function being tested only takes a single argument - the function is always called by unpacking the argument list. This is the line that calls the student function:

res = st_func(*args)

So the entire test vector has to always be a list of sequences.
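
The effect of unpacking can be tried out in plain Python. In this small sketch ref_square is a hypothetical single-argument function used only for illustration; it shows that each case must be a sequence even when there is just one argument.

```python
def ref_square(x):
    # Hypothetical single-argument function used for illustration.
    return x * x

# Each test case is a sequence of arguments, even with only one argument.
good_vector = [(2,), (3,), (10,)]
results = [ref_square(*args) for args in good_vector]

# Bare values cannot be unpacked with *, so this vector does not work.
bad_vector = [2, 3, 10]
unpack_error = None
try:
    ref_square(*bad_vector[0])
except TypeError as error:
    unpack_error = error
```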

A good test vector has multiple cases, some of which are entirely randomly generated and some of which cover the edge cases that need to be tested specifically. Randomness makes it impossible for students to make code that tries to pass the test instead of actually implementing what was required, while covering edge cases makes sure that partially functioning solutions don’t get accepted accidentally if they get a favorable random test vector.

There are no strict guidelines to how many test cases should be in a checker. For simple checkers, we've been using 10 cases. If the exercise is complex enough that running some or all cases takes a noticeable amount of time, it's best to keep the checking time to a minimum. The default timeout for checker runs is 5 seconds and while this can be changed, it should only be done if necessary.

In C checkers test vectors have an additional complication: data types. Simple numbers work straight away, although Python floats are always doubles. Strings, on the other hand, need to be converted into char arrays, and pointers and structs are also more complicated. These will be described separately. Lists of numbers are converted by CFFI into arrays automatically when calling C functions.

Changing the Validator¶

The default validator of PySenpai, called result_validator, is adequate for most checkers where no fuzziness is required in validation, although for C checkers it only really works with numbers. However, there are times when changing the validator is needed (or recommended). Our example here has floating point numbers as results. This runs the risk of rounding errors if the student implementation is different. For this reason, it would be safer to use rounding_float_result_validator as our validator. To do so, we'll change the test_c_function call to:

c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang, validator=core.rounding_float_result_validator)
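
The sketch below illustrates why exact comparison is risky. It assumes a student function that computes with 32-bit floats, simulated here with the struct module. The names to_c_float and sketch_rounding_validator are hypothetical, and the three-decimal precision is an arbitrary choice for the sketch, not necessarily what rounding_float_result_validator uses.

```python
import struct

def to_c_float(value):
    # Round-trip a Python float (a C double) through a 32-bit C float.
    return struct.unpack("f", struct.pack("f", value))[0]

def sketch_rounding_validator(ref, res, parsed):
    # Hypothetical validator: compare rounded values instead of exact ones.
    assert round(ref, 3) == round(res, 3)

ref = 0.5 * 0.2 * 1.0 * 1.0   # reference result computed with doubles
res = to_c_float(ref)         # what a float-based C function could return

exact_match = (ref == res)    # exact comparison fails for this value
sketch_rounding_validator(ref, res, None)   # rounding comparison passes
```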

There are a couple of other validators available:

parsed_result_validator: validates values parsed from student function's output instead of its return value
parsed_list_validator: validates a list of values parsed from the output against a list returned by the reference function

Both are still adequate for C checkers as long as proper output parsing is provided. If you need more elaborate validation, it's time to implement a custom validator. Details are provided in a later section.

Using Input¶

In this section we'll go through how to test functions that read inputs from the user. There's actually no difference between C and Python checkers in this regard because it's just Python strings being written to a stream. The example is simpler however. Extending the previous example, let's assume we want to add another function to the exercise. This function will prompt the user for a positive float number.

On the checker side this means we need to add another set of basic function test components. This time two vectors are needed: one for arguments to the function and another for inputs. The input vector contains the values that will be fed to the student's input function calls through a faked stdin. The stdin for each test run will be formed from the input vector by joining it with newlines. This line in the test_c_function code does it:

sys.stdin = io.StringIO("\n".join([str(x) for x in inps]))
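
As a small standalone illustration of the same idea, the sketch below builds the fake stream from a mixed-type input list and reads it back line by line. It does not replace sys.stdin; the variable names are chosen for this example only.

```python
import io

# Mixed types are fine because str() is applied to every value.
inps = [-3.5, "abc", 12.25]

fake_stdin = io.StringIO("\n".join([str(x) for x in inps]))

# Each readline() corresponds to one input request made by the tested code.
lines_read = [fake_stdin.readline().strip() for _ in inps]
```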

Since str will be called for each value in the input vector, the vector can contain any types of values. This makes implementing the test somewhat easier. The following example does randomization rather thoroughly:

def gen_input_vector():
    v = []
    v.append((round(random.random() * -100, 3), round(random.random() * 100, 3)))
    for i in range(9):
        case = []
        for j in range(random.randint(0, 2)):
            case.append(round(random.random() * -100, random.randint(1, 4)))
            
        case.append(round(random.random() * 100, random.randint(1, 4)))
        v.append(case)
    
    random.shuffle(v)
    return v

What about the reference? By default it would only be given the arguments. However this is not very useful because the information we actually want is in the input vector. In order to get the input vector passed to the reference, we need to set the ref_needs_inputs keyword argument to True when calling test_c_function. This change needs to be reflected in the reference function - it now gets two arguments: the list of arguments and the list of inputs. The reference in this case is actually very simple:

def ref_prompt(args, inputs):
    return inputs[-1]

This is because we know from implementing the test vector that the last item is always the proper input while everything before it is not. Therefore, when the student function is working correctly it should return the same value. All that's left is to call test_c_function with two new keyword arguments - one for providing the inputs, and another for telling the function that our reference expects to see both vectors:

c.test_c_function(st_lib, st_prompt_function, [[]] * 10, ref_prompt, lang,
    inputs=gen_input_vector,
    ref_needs_inputs=True
)
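
To see why returning the last input is a valid reference, the following sketch models a correctly working prompt function in Python. simulated_prompt is a hypothetical stand-in for the student's C function and is not part of the checker.

```python
def simulated_prompt(inputs):
    # Python model of a correct student function: keep reading values
    # until a positive one is found. Hypothetical, for illustration only.
    for value in inputs:
        if value > 0:
            return value
    return None

def ref_prompt(args, inputs):
    return inputs[-1]

# Two rejected inputs followed by one valid input, as the generator produces.
case = [-12.5, -0.3, 42.17]
student_result = simulated_prompt(case)
reference_result = ref_prompt([], case)
```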

The updated example is below:

kinetic_test_input_c.py

import random
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "laske_kineettinen_energia",
    "en": "calculate_kinetic_energy"
}

st_prompt_function = {
    "fi": "pyyda_liukuluku",
    "en": "prompt_float"
}

so_name = {
    "fi": "energia",
    "en": "energy"
}

msgs = core.TranslationDict()

def gen_vector():
    v = []
    for i in range(10):
        v.append((round(random.random() * 100, 2), round(random.random() * 100, 2)))
    
    return v

def gen_input_vector():
    v = []
    v.append((round(random.random() * -100, 3), round(random.random() * 100, 3)))
    for i in range(9):
        case = []
        for j in range(random.randint(0, 2)):
            case.append(round(random.random() * -100, random.randint(1, 4)))
            
        case.append(round(random.random() * 100, random.randint(1, 4)))
        v.append(case)
    
    random.shuffle(v)
    return v

def ref_prompt(args, inputs):
    return inputs[-1]

def ref_func(v, m):
    return 0.5 * m * v * v

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang)
    if st_lib:
        c.test_c_function(st_lib, st_prompt_function, [[]] * 10, ref_prompt, lang,
            inputs=gen_input_vector,
            ref_needs_inputs=True
        )
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang)

Using Output¶

Another way to obtain results from a student function is to parse values from its output. This is also exactly the same as doing this in Python checkers, because it's only Python code interacting with a string. Typically validating output involves implementing a parser function that obtains the relevant values, and changing the validator to one that uses these values instead of the function's return values. To show how to do this, we're going to make a checker for a function that prints all even or odd numbers from a list.

The parser is a function that receives the raw output produced by the student function and returns values that will be validated against the reference. In this example, it should be a function that finds all integers from the output and returns them as a list. Most of the time this is done using the findall method of regular expression objects.

int_pat = re.compile("-?[0-9]+")

def parse_ints(output):
    return [int(v) for v in int_pat.findall(output)]

Note that it is always better to be more lenient in parsing than in validating. For example, if your exercise demands floating point numbers to be printed with exactly 2 decimal precision, your parser should still return all floats from the output. Incorrect precision should be caught by the validator instead. Also it's best to make sure your parser doesn't miss values due to unexpected formatting - at least within reasonable limits. It's confusing to the students if their code prints values but the checker claims it didn't find them. In cases where you absolutely want to complain about the output being unparseable, you can raise OutputParseError from the parser function. This will abort the current test case and mark it incorrect.
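
As a sketch of lenient parsing, the parser below accepts floats with any number of decimals as well as plain integers, leaving precision checks to the validator. The pattern and the parse_floats name are assumptions made for this illustration, and the sample output is invented.

```python
import re

# Optional sign, digits, and an optional decimal part of any length.
float_pat = re.compile(r"-?[0-9]+(?:\.[0-9]+)?")

def parse_floats(output):
    # Accept any number of decimals; precision is the validator's concern.
    return [float(v) for v in float_pat.findall(output)]

sample_output = "Energy: 12.5 J\nEnergy: 3 J\nEnergy: -0.250 J\n"
parsed = parse_floats(sample_output)
```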

Implementing a reference for output validation is no different than implementing one for a function that returns the values instead. You can see the reference in the example at the end. A few adjustments are needed when validating output instead of return value. First, the validator needs to be changed to parsed_result_validator (or some other that validates outputs). Second, some messages need to be adjusted. By default the messages talk about return values. Here's a set of replacements for tests that validate outputs instead. In the future we may add an option to use these by setting a flag when calling test_function.

msgs = core.TranslationDict()
msgs.set_msg("CorrectResult", "fi", "Funktio tulosti oikeat arvot.")
msgs.set_msg("CorrectResult", "en", "Your function printed the correct values.")
msgs.set_msg("PrintStudentResult", "fi", "Funktion tulosteesta parsittiin arvot: {parsed}")
msgs.set_msg("PrintStudentResult", "en", "Values parsed from function output: {parsed}")
msgs.set_msg("PrintReference", "fi", "Olisi pitänyt saada: {ref}")
msgs.set_msg("PrintReference", "en", "Should have been: {ref}")

One additional hurdle in C checkers is that in order for output redirection to work, load_library needs to do some sorcery. The sorcery is enabled by setting the keyword argument req_stdio to True. When this option is enabled, load_library will report an error if the student code doesn't have an include for stdio. In very few words, the C extension needs to call the setbuf function for stdout after loading the library to set the output buffer to NULL (i.e. no buffering).

With these in mind, we can call test_c_function. Because the return value of the function is always None, setting test_recurrence to False is called for. Doing so avoids complaining to the student that their function always returns the same value.

c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
    custom_msgs=msgs,
    output_parser=parse_ints,
    validator=core.parsed_result_validator,
    test_recurrence=False
)

<!file-llguide-even-odd-c-py>

Validating Messages¶

The previous section covered how to validate values that are parsed from output. What about validating things like error messages? While you can technically do it with a normal validator, using a separate message validator is recommended. When the two validations are separated the student has a better idea of where the problems in their code are. Message validators operate with knowledge of arguments and inputs given to the student function, and its full output. Just like a normal validator, a message validator should also make assert statements about the output. Because message validators often deal with natural language, maximum leniency is recommended.

Getting back to our kinetic energy example, let's assume we've instructed the students to print the error message "Value must be positive" when the given input is negative. Usually these kinds of validators either use regular expressions or dissect the string in some other way. When using regular expressions, one way is to put them in a TranslationDict object.

msg_patterns = core.TranslationDict()
msg_patterns.set_msg("negative", "fi", re.compile("positiivinen"))
msg_patterns.set_msg("negative", "en", re.compile("positive"))

The validator itself should go through the inputs and see that there's a proper error message for each improper input. However, before doing that it should also check that there's a sufficient number of prompts - otherwise this validator will cause an uncaught exception (IndexError for lines[i]).

def error_msg_validator(output, args, inputs):
    lines = output.split("\n")
    
    assert len(lines) >= len(inputs), "fail_insufficient_prompts"
    for i, value in enumerate(inputs):
        if value < 0:
            assert msg_patterns.get_msg("negative", lang).search(lines[i]), "fail_negative"
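
The validator logic can be tried outside PySenpai with a few sample outputs. In the sketch below a plain dictionary stands in for the TranslationDict of patterns, failed_handle is a hypothetical helper that reports which assert failed, and the sample outputs assume one output line per input.

```python
import re

lang = "en"
# Plain dict standing in for the TranslationDict of patterns.
patterns = {"en": re.compile("positive"), "fi": re.compile("positiivinen")}

def error_msg_validator(output, args, inputs):
    lines = output.split("\n")
    assert len(lines) >= len(inputs), "fail_insufficient_prompts"
    for i, value in enumerate(inputs):
        if value < 0:
            assert patterns[lang].search(lines[i]), "fail_negative"

def failed_handle(output, inputs):
    # Return the handle of the first failed assert, or None on success.
    try:
        error_msg_validator(output, [], inputs)
    except AssertionError as error:
        return str(error)
    return None

ok = failed_handle("Value must be positive\nThank you\n", [-5.0, 3.2])
wrong_text = failed_handle("Try again\nThank you\n", [-5.0, 3.2])
too_short = failed_handle("Value must be positive", [-5.0, -1.0, 3.0])
```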

Using "fail_" to prefix message names is a convention. These handles correspond to actual messages in the custom messages dictionary. Here you can also see how to add hints to a message - the value is now a dictionary instead of a string. More details about this can be found in a separate section.

msgs = core.TranslationDict()
msgs.set_msg("fail_insufficient_prompts", "fi", dict(
    content="Funktio ei kysynyt lukua tarpeeksi montaa kertaa.",
    hints=["Tarkista, että funktio hylkää virheelliset syötteet oikein."]
))
msgs.set_msg("fail_insufficient_prompts", "en", dict(
    content="The function didn't prompt input sufficient number of times.",
    hints=["Make sure the function rejects erroneous inputs properly."]
))
msgs.set_msg("fail_negative", "fi", "Negatiivisestä syötteestä kertova virheviesti oli väärä.")
msgs.set_msg("fail_negative", "en", "The error message for negative input was wrong.")

Because we're dealing with student code that has output, load_library needs req_stdio set to True. After this we can again call test_c_function.

c.test_c_function(st_lib, st_prompt_function, [[]] * 10, ref_prompt, lang,
    custom_msgs=msgs,
    inputs=gen_input_vector,
    ref_needs_inputs=True,
    message_validator=error_msg_validator
)

Full example:

kinetic_test_messages_c.py

import random
import re
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "laske_kineettinen_energia",
    "en": "calculate_kinetic_energy"
}

st_prompt_function = {
    "fi": "pyyda_liukuluku",
    "en": "prompt_float"
}

so_name = {
    "fi": "energia",
    "en": "energy"
}

msg_patterns = core.TranslationDict()
msg_patterns.set_msg("negative", "fi", re.compile("positiivinen"))
msg_patterns.set_msg("negative", "en", re.compile("positive"))

msgs = core.TranslationDict()
msgs.set_msg("fail_insufficient_prompts", "fi", dict(
    content="Funktio ei kysynyt lukua tarpeeksi montaa kertaa.",
    hints=["Tarkista, että funktio hylkää virheelliset syötteet oikein."]
))
msgs.set_msg("fail_insufficient_prompts", "en", dict(
    content="The function didn't prompt input sufficient number of times.",
    hints=["Make sure the function rejects erroneous inputs properly."]
))
msgs.set_msg("fail_negative", "fi", "Negatiivisestä syötteestä kertova virheviesti oli väärä.")
msgs.set_msg("fail_negative", "en", "The error message for negative input was wrong.")

def gen_vector():
    v = []
    for i in range(10):
        v.append((round(random.random() * 100, 2), round(random.random() * 100, 2)))
    
    return v

def gen_input_vector():
    v = []
    v.append((round(random.random() * -100, 3), round(random.random() * 100, 3)))
    for i in range(9):
        case = []
        for j in range(random.randint(0, 2)):
            case.append(round(random.random() * -100, random.randint(1, 4)))
            
        case.append(round(random.random() * 100, random.randint(1, 4)))
        v.append(case)
    
    random.shuffle(v)
    return v

def ref_prompt(args, inputs):
    return inputs[-1]

def ref_func(v, m):
    return 0.5 * m * v * v

def error_msg_validator(output, args, inputs):
    lines = output.split("\n")
    
    assert len(lines) >= len(inputs), "fail_insufficient_prompts"
    for i, value in enumerate(inputs):
        if value < 0:
            assert msg_patterns.get_msg("negative", lang).search(lines[i]), "fail_negative"

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang, req_stdio=True)
    if st_lib:
        c.test_c_function(st_lib, st_prompt_function, [[]] * 10, ref_prompt, lang,
            custom_msgs=msgs,
            inputs=gen_input_vector,
            ref_needs_inputs=True,
            message_validator=error_msg_validator
        )
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang)

Custom Validators¶

Default validators are adequate for tests where exact matching between reference and student result is reasonable, and informative enough. However when the correct result has multiple potential representations or it is simply very complex, custom validators might be needed. Likewise if the task itself is complex, a custom validator with more than one assert can give better information about how exactly the student's submission is wrong.

Validators are functions that use one or more assert statements to compare the student's result or parsed output (or both) against the reference. The first failed assert is reported as the test result. If the validator runs through without failed asserts, the validation is considered successful. Each assert can be connected to a message that's been defined in the custom messages provided by the checker. To do so, the assert statement should use a handle string which corresponds to a message key in the dictionary.

To demonstrate one use case for custom validators, we'll modify the even/odd printing function checker a bit. We add a feature that the student submission must also return the number of even/odd values in the given list of numbers. Therefore we now have to validate both the return value of the student function and the values parsed from its output. Both of these are already obtained by the checker, but we need to modify the reference to provide both as well.

def ref_func(numbers, n, even):
    # n is the length of the list; it is needed by the C function only
    if even:
        printed = [x for x in numbers if x % 2 == 0]
        return len(printed), printed
    else:
        printed = [x for x in numbers if x % 2 != 0]
        return len(printed), printed

Now we can implement a validator that checks both the return value and the parsed values. The strings at the end of each assert correspond to messages defined in the custom messages dictionary.

msgs = core.TranslationDict()
msgs.set_msg("fail_return_value", "fi", "Funktion paluuarvo oli väärä.")
msgs.set_msg("fail_return_value", "en", "The function's return value was incorrect.")
msgs.set_msg("fail_values", "fi", "Funktion tulostamat arvot olivat väärin.")
msgs.set_msg("fail_values", "en", "Values printed by the function were incorrect.")

def validate_both(ref, res, parsed):
    assert res == ref[0], "fail_return_value"
    assert tuple(parsed) == tuple(ref[1]), "fail_values"
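
The behavior of the validator can be checked in isolation. In this sketch the reference is written in a compact form, and first_failure is a hypothetical helper that returns the handle of the first failed assert; neither is part of PySenpai.

```python
def ref_func(numbers, n, even):
    # Keep the values whose evenness matches the requested mode.
    printed = [x for x in numbers if (x % 2 == 0) == bool(even)]
    return len(printed), printed

def validate_both(ref, res, parsed):
    assert res == ref[0], "fail_return_value"
    assert tuple(parsed) == tuple(ref[1]), "fail_values"

def first_failure(ref, res, parsed):
    # Return the handle of the first failed assert, or None on success.
    try:
        validate_both(ref, res, parsed)
    except AssertionError as error:
        return str(error)
    return None

ref = ref_func([2, 3, 5, 8, 10], 5, True)

all_correct = first_failure(ref, 3, [2, 8, 10])
wrong_count = first_failure(ref, 2, [2, 8, 10])
wrong_prints = first_failure(ref, 3, [2, 8])
```

Because only the first failed assert is reported, a submission with both a wrong return value and wrong prints is told about the return value first.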

Calling test_c_function with these modifications (and a presenter which is shown in the full example):

c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
    custom_msgs=msgs,
    presenter=presenters,
    output_parser=parse_ints,
    validator=validate_both
)

Full example:

llguide-even-odd-rv-c-py

import random
import re
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

int_pat = re.compile("-?[0-9]+")

st_function = {
    "fi": "tulosta_parilliset_parittomat",
    "en": "print_even_odd"
}

so_name = {
    "fi": "parilliset_parittomat",
    "en": "even_odd"
}

msgs = core.TranslationDict()
msgs.set_msg("CorrectResult", "fi", "Funktio tulosti ja palautti oikeat arvot.")
msgs.set_msg("CorrectResult", "en", "Your function printed and returned the correct values.")
msgs.set_msg("PrintStudentResult", "fi", "Funktion paluuarvo: {res}\nFunktion tulosteesta parsittiin arvot: {parsed}")
msgs.set_msg("PrintStudentResult", "en", "Return value: {res}\nValues parsed from function output: {parsed}")
msgs.set_msg("PrintReference", "fi", "Oikea paluuarvo: {ref[0]}\nOikeat tulosteet: {ref[1]}")
msgs.set_msg("PrintReference", "en", "Correct return value: {ref[0]}\nCorrect prints: {ref[1]}")
msgs.set_msg("fail_return_value", "fi", "Funktion paluuarvo oli väärä.")
msgs.set_msg("fail_return_value", "en", "The function's return value was incorrect.")
msgs.set_msg("fail_values", "fi", "Funktion tulostamat arvot olivat väärin.")
msgs.set_msg("fail_values", "en", "Values printed by the function were incorrect.")

def gen_vector():
    v = []
    v.append(([2, 3, 5, 8, 10], 5, True))
    v.append(([4, 3, 7, 9, 11], 5, False))
    
    for i in range(8):
        numbers = [random.randint(-100, 100) for j in range(random.randint(5, 10))]
        v.append((numbers, len(numbers), random.randint(0, 1)))
        
    random.shuffle(v)
        
    return v

def ref_func(numbers, n, even):
    # n is the length of the list; it is needed by the C function only
    if even:
        printed = [x for x in numbers if x % 2 == 0]
        return len(printed), printed
    else:
        printed = [x for x in numbers if x % 2 != 0]
        return len(printed), printed

def parse_ints(output):
    return [int(v) for v in int_pat.findall(output)]

def validate_both(ref, res, parsed):
    assert res == ref[0], "fail_return_value"
    assert tuple(parsed) == tuple(ref[1]), "fail_values"

def ref_presenter(value):
    return value[0], " ".join(str(x) for x in value[1])    

presenters = {
    "ref": ref_presenter
}

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang, req_stdio=True)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
            custom_msgs=msgs,
            presenter=presenters,
            output_parser=parse_ints,
            validator=validate_both,
        )

Other Customization¶

Most customization was covered by the previous sections. There are two more keyword arguments to test_c_function that were not covered: new_test and repeat. The first is for miscellaneous preparations at the start of each test case and the latter is a mechanism for calling the student function multiple times with the same arguments and inputs instead of once. Both of these are mostly for enabling fringe checkers.

If you provide a function as the new_test callback, this function will be called as the very first thing in each test case. In most checkers it's not needed for anything at all (the default does nothing). However, if there are persistent objects involved in the checking process that are not re-initialized by the student code, this callback is the correct place to reset them. The callback receives two arguments: test case arguments and inputs. Other objects that you need to access have to be accessed globally within the checker code.
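
A minimal sketch of such a callback is shown below. The call_log list represents some persistent checker-side object, and the loop is a hypothetical driver that imitates running two test cases; it is not PySenpai code.

```python
call_log = []   # persistent object that the student code never resets

def reset_state(args, inputs):
    # new_test callback: receives the test case arguments and inputs.
    call_log.clear()

# Imitate two test cases (hypothetical driver loop for illustration).
sizes = []
for args, inputs in [((1, 2), []), ((3, 4), [])]:
    reset_state(args, inputs)
    call_log.append(args)      # something accumulates during the case
    sizes.append(len(call_log))
```

With the callback in place the log holds one entry per case; without it the entries from earlier cases would pile up.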

Repeat is needed even more rarely. It is only useful for testing functions that would normally be called multiple times within a loop to achieve the desired result.

Output Customization¶

An important aspect of checkers is that their output should accurately represent what is going on in the test. Misleading messages are worse than no messages at all. PySenpai's default set of messages is adequate for most normal situations, but when checkers do something slightly different, altering the messages appropriately is called for. As discussed earlier, checkers can also add their own messages for validation and diagnosis features. These should be more helpful than "Your function returned incorrect value(s)".

Another aspect of output is the representation of code and data. Again PySenpai provides reasonable defaults that represent things like simple function calls, simple values and even simple structures quite well. However, when checkers involve more complex code or structures, these default representations can become unwieldy. In these situations, checkers should provide presenter overrides - functions that format values more pleasantly for the evaluation report. C in particular is more difficult to present adequately with default presenters - especially when working with pointers.

When Lovelace processes the evaluation log from PySenpai, it uses the Lovelace markup parser to render everything. This means messages can - and usually should - include markup to make the feedback easier to read. PySenpai's default messages do use Lovelace markup. Any message can also be accompanied by a list of hints and/or triggers by using a dictionary instead of a string as the message value. So the message value can be either a string, or a dictionary with the following keys:

content (string)
hints (list of strings)
triggers (list of strings)

If you want to add hints or triggers to a default message, simply omit the content key from your dictionary. For new messages content must always be provided.

Overriding Default Messages¶

Overriding in PySenpai works by calling the update method of the default message dictionary with the custom_msgs parameter as its argument. The update method of TranslationDict is slightly modified from a normal dictionary update. On the operational level, all messages in PySenpai are stored as dictionaries. However, when overriding or adding messages, the message can be provided as a string or a dictionary. When overriding an existing message with a string, the string replaces the value of the "content" key in the dictionary.

To create an override, simply create a message in your custom message dictionary with the same key as the one you want to replace. You can find the specifics of each default message in the PySenpai Message Reference, including what keyword arguments are available for format strings. If you use a dictionary without the "content" key as your override, the default message will be unchanged but any hints or triggers will be included in the evaluation log if the message is triggered.
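
The override rules can be modeled with a few lines of plain Python. This is only a toy model of the behavior described above: apply_override is a hypothetical function, the default message text is invented, and PySenpai's actual update method may differ in its details.

```python
def apply_override(default, override):
    # Toy model of the override rules; not PySenpai's implementation.
    merged = dict(default)
    if isinstance(override, str):
        # A string replaces only the content of the message.
        merged["content"] = override
    else:
        # A dictionary may replace the content and add hints or triggers.
        merged.update(override)
    return merged

# Hypothetical default message, stored as a dictionary.
default = {"content": "Your function returned the correct value."}

replaced = apply_override(default, "Your function printed the correct values.")
with_hint = apply_override(default, {"hints": ["Check the formula."]})
```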

Adding Messages¶

Messages need to be added when a checker wants to report something that PySenpai doesn't report by default. Custom validators that have more than one assert typically have separate message keys for each assert as was shown earlier. Messages corresponding to these keys must be found in the message dictionary. Most of these messages must be plain statements - they are not offered any formatting arguments. The sole exceptions are messages tied to information functions (introduced later), which get the function's return value as a formatting argument. Adding them is very straightforward since the checker developer is in charge of choosing the message keys.

In addition to completely custom messages, you can also add messages for exceptions that might occur when loading the student code as a library or calling the student function. However with C this is somewhat less relevant because obviously you don't get Python exceptions when something goes wrong in the C code. When loading the student code, PySenpai has default messages for the following exceptions:

ImportError
EOFError
SystemExit
SyntaxError
IndentationError
NameError
OSError

When calling the student function, the following exceptions have default messages:

TypeError
AttributeError
EOFError
SystemExit

To add a message for an exception, simply use the exception's name (with CamelCase) as the message key. When messages are printed into the evaluation log, they are given two format arguments: ename and emsg. Both are obtained from Python's stack trace. For legacy reasons the argument and input lists are also given (as args and inputs) when formatting but should not be used - both are also printed separately when an exception occurs using the PrintTestVector and PrintInputVector messages.
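
A minimal sketch of how such a message could be filled in is shown below. It uses a plain dictionary in place of the message dictionary, and the message text is hypothetical; only the ename and emsg argument names and the use of the exception's name as the key come from the description above.

```python
# Stand-in for a custom message dictionary, keyed by exception name.
exception_msgs = {
    "ValueError": "Calling the function raised {ename}: {emsg}",
}

try:
    int("not a number")
except ValueError as error:
    ename = type(error).__name__   # "ValueError", also used as the message key
    emsg = str(error)
    message = exception_msgs[ename].format(ename=ename, emsg=emsg)

print(message)
```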

Overriding Presenters¶

Presenters are functions that prepare various test-related data to be shown in the evaluation log. Function tests support 6 presenters, although they only use 5 of those by default. The presenters are for:

Arguments
Function call
Inputs
Student result
Reference result
Values parsed from output

Of these, the arguments presenter is not used by default because the arguments are shown in the function call representation. The latter is more useful because it shows exactly how the student's function was called, and students can copy the line into their own code to try it out themselves. The default call presenter splits long function calls into multiple lines, and always wraps the call in syntax-highlighted code block markup. The default presenter for data values mainly uses repr and cleans braces from lists, tuples and dictionaries.

The biggest limiting factor with the defaults for C checkers is that they can't really show pointers in any meaningful way because pointers don't exist in Python. Because of this custom presenters are more commonly required for C checkers. The use of pointers and suggestions for how to present them in the evaluation log are provided in the next section.

Presenter overrides are given to test_c_function as a dictionary. You only need to provide key/value pairs for presenters you want to override. Below is the dictionary that contains the defaults. Just copy it, put in your replacements and cut the other lines. When calling test_c_function, this dictionary should be given as the presenter keyword argument.

default_presenters = {
    "arg": default_c_value_presenter,
    "call": default_c_call_presenter,
    "input": default_input_presenter,
    "ref": default_c_value_presenter,
    "res": default_c_value_presenter,
    "parsed": default_value_presenter
}

Presenters themselves are relatively straightforward functions: they receive a single argument, and should return a single representation. Usually this should be a fully formatted string. However, if you also override the message, you can return a structure and do the rest of the formatting in the message format string instead. Presenters are expected to handle their own exceptions. This is especially important for presenters that format student results. A good practice is to catch any exception and fall back to return repr(value).

The most common cases for custom presenters are values where repr isn't sufficient to give a nice representation (e.g. two-dimensional lists) or a sensible representation at all (e.g. pointers, structs, files). Likewise the default call presenter has trouble when given long lists as arguments. The optimal representation of data structures is often dependent on the exact task at hand. For exercises where files are written, the following example can be used to show the file contents (in a separate box using a monospace font) when given a filename.

def file_presenter(value):
    try:
        with open(value) as source:
            return "{{{\n" + source.read() + "\n}}}"
    except Exception:
        return core.default_value_presenter(value)
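
The same pattern can be applied to two-dimensional lists. The sketch below is one possible presenter, not something shipped with PySenpai: the name grid_presenter is made up, and it falls back to repr so that it runs without the PySenpai modules.

```python
def grid_presenter(value):
    """Show a list of rows one row per line inside a monospace box."""
    try:
        rows = [" ".join(str(cell) for cell in row) for row in value]
        return "{{{\n" + "\n".join(rows) + "\n}}}"
    except Exception:
        # Student results can be anything, so never let the presenter crash.
        return repr(value)

print(grid_presenter([[1, 2, 3], [4, 5, 6]]))
print(grid_presenter(None))
```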

See the next section for examples involving pointers and data structures.

Working with Pointers and C Structures¶

Using Pointers and Arrays¶

Checking functions that modify existing values instead of returning anything can be done using a result object extractor. These are functions that modify the student result. The most commonly used modification is to replace the return value with one of the arguments from the test vector - the one that contains the pointer to the value that was modified. When doing tests with pointers, it should be noted that - by default - the same pointer is passed to the reference function, the student function, and any other functions involved. Since this will lead to problems, we need another modification to the default behavior: an argument cloner. A cloner creates copies of mutable objects in the argument vector to prevent functions from affecting each other.
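
To see why a cloner matters, the sketch below uses a one-element Python list as a stand-in for a pointer, so it runs without CFFI. The functions reference, student and list_cloner are hypothetical and only model the situation: a submission that forgets to write its result would otherwise appear to have produced the value written by the reference.

```python
def reference(v, m, out):
    out[0] = 0.5 * m * v * v   # writes the result through the "pointer"

def student(v, m, out):
    pass                        # buggy submission: never writes the result

def list_cloner(args):
    # Copy the mutable slot so that each function gets its own storage.
    return [args[0], args[1], list(args[2])]

# Without cloning, both calls share the same slot.
shared = [2.0, 3.0, [0.0]]
reference(*shared)
student(*shared)
print(shared[2][0])             # value left behind by the reference

# With cloning, the student's slot keeps its initial value.
original = [2.0, 3.0, [0.0]]
ref_args = list_cloner(original)
st_args = list_cloner(original)
reference(*ref_args)
student(*st_args)
print(ref_args[2][0], st_args[2][0])
```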

As stated earlier, Python lists that contain numbers will be automatically converted to arrays when passed to the C code. If modified from either side, the changes will be reflected. Meanwhile, pointers are created through CFFI. To create a pointer, use the new method of the ffi object (which is created at the start of the checker skeleton precisely for this purpose). For example, let's create a pointer to a double:

ptr = ffi.new("double*")

This will create a CData object which converts into a pointer when passed to C code. So, for example, let's say we want to change our kinetic energy calculator so that instead of returning the result, it writes it into memory using a pointer. We're going to need to generate the result variables for the test vector:

def gen_vector():
    v = []
    for i in range(10):
        v.append([round(random.random() * 100, 2), round(random.random() * 100, 2), ffi.new("double*")])
    
    return v

The next problem is getting this value to the validator. If we view a CData value in the Python console, it looks like this <cdata 'double *' owning 8 bytes> which is not terribly useful. To get to the actual value, we need to use index 0 as if the CData variable was a list. This makes sense when we consider the relationship of arrays and pointers in C. Using this information, we can implement a result object extractor:

def result_from_pointer(args, res, parsed):
    return args[2][0]

An argument cloner is actually not needed in this checker, as we don't initialize the variable with a value. However, for the sake of completeness it is shown here. There is no direct way to copy CData objects; the only way seems to be to manually create new ones and copy the values, like this:

def pointer_cloner(args):
    clone = ffi.new("double*")
    clone[0] = args[2][0]
    return [args[0], args[1], clone]

With these pieces in place the checker would already work. However, if CData is printed into the evaluation log, it isn't very useful as it simply shows "<cdata 'double *' owning 8 bytes>". A presenter for the function call that shows something more useful is called for. The best way is to show the definition line for the variable where the result is stored, as this creates a representation that students can copy into their main function.

res_var = {
    "fi": "tulos",
    "en": "result"
}

def call_pointer_presenter(fname, args):
    res_var_def = "double " + res_var[lang] + ";\n"
    args[2] = "&" + res_var[lang]
    call = fname + "("
    call += ", ".join(str(arg) for arg in args)
    call += ");"    
    return "{{{highlight=c\n" + res_var_def + call + "\n}}}"
                
presenters = {
    "call": call_pointer_presenter,
}
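
A possible variant of the call presenter is sketched below. It builds a new list for display instead of assigning to args[2], which may be safer if the same argument list is used elsewhere, and it catches its own exceptions as recommended for presenters. The lang value is set by hand here only so that the snippet runs on its own; in a real checker it comes from core.parse_command().

```python
res_var = {"fi": "tulos", "en": "result"}
lang = "en"  # assumption for this sketch; normally set by core.parse_command()

def call_pointer_presenter(fname, args):
    try:
        # Show the two numeric arguments as they are and the pointer as &result.
        shown = [str(arg) for arg in args[:2]] + ["&" + res_var[lang]]
        res_var_def = "double " + res_var[lang] + ";\n"
        call = fname + "(" + ", ".join(shown) + ");"
        return "{{{highlight=c\n" + res_var_def + call + "\n}}}"
    except Exception:
        return repr((fname, args))

pointer_stand_in = object()  # takes the place of the CData pointer
args = [12.5, 80.0, pointer_stand_in]
shown_call = call_pointer_presenter("calculate_kinetic_energy", args)
print(shown_call)
```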

And with all the pieces in place, the call to test_c_function looks like:

c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
    argument_cloner=pointer_cloner,
    result_object_extractor=result_from_pointer,
    presenter=presenters
)

And the full example:

kinetic_pointer_test.py

import random
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "laske_kineettinen_energia",
    "en": "calculate_kinetic_energy"
}

so_name = {
    "fi": "energia_ptr",
    "en": "energy_ptr"
}

res_var = {
    "fi": "tulos",
    "en": "result"
}

msgs = core.TranslationDict()

def gen_vector():
    v = []
    for i in range(10):
        v.append([round(random.random() * 100, 2), round(random.random() * 100, 2), ffi.new("double*")])
    
    return v

def ref_func(v, m, res):
    return 0.5 * m * v * v

def result_from_pointer(args, res, parsed):
    return args[2][0]

def pointer_cloner(args):
    clone = ffi.new("double*")
    clone[0] = args[2][0]
    return [args[0], args[1], clone]

def call_pointer_presenter(fname, args):
    res_var_def = "double " + res_var[lang] + ";\n"
    args[2] = "&" + res_var[lang]
    call = fname + "("
    call += ", ".join(str(arg) for arg in args)
    call += ");"    
    return "{{{highlight=c\n" + res_var_def + call + "\n}}}"
        
presenters = {
    "call": call_pointer_presenter,
}

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
            argument_cloner=pointer_cloner,
            result_object_extractor=result_from_pointer,
            presenter=presenters
        )

Using Strings¶

Giving strings as function arguments is relatively simple with CFFI. Because Python uses unicode strings and C uses character arrays, some conversion is required. If the string is only read by the function it's sufficient to convert the Python string into a byte string - CFFI will automatically convert byte strings to character arrays when calling C functions. If the string in question is a literal value, the conversion to byte string can be done with the bytes prefix, i.e.:

c_string = b"this is a string"

If the string is not a literal value (often the case in test vectors) it can be converted using the encode method. To convert a string of 10 random ASCII letters into a byte string:

c_string = "".join(random.choice(string.ascii_letters) for i in range(10)).encode("utf-8")

However, if the C function mutates the string, a CData object must be created because Python strings cannot be written into. Similarly to the pointer example above, this is created by using ffi.new. Instead of a pointer we just create a character array. If the string has content, it can be created without defining length just like in C:

c_string = ffi.new("char []", b"this is a string")

If you want to define an empty array for storing a result, it is also done similarly to C. Also, like in C, remember to reserve space for the null terminator character. So in order to create a result string for writing 4 characters:

c_string = ffi.new("char [5]")
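
When the length is not known in advance, the size has to be computed. The helpers below are a small sketch with hypothetical names (c_buffer_size, char_array_decl); they only build the type string that would be passed to ffi.new and do not call CFFI themselves. Note that the size is counted in encoded bytes, so non-ASCII characters need more than one byte each in UTF-8.

```python
def c_buffer_size(text, encoding="utf-8"):
    """Bytes needed to store text as a C string, null terminator included."""
    return len(text.encode(encoding)) + 1

def char_array_decl(text):
    """Type string for a char array that can hold text, e.g. 'char [5]'."""
    return "char [{}]".format(c_buffer_size(text))

print(char_array_decl("abcd"))        # char [5]
print(c_buffer_size("\u00e4iti"))     # 6, because the first letter takes two bytes
```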

When moving back to Python, it's best to convert back to unicode strings. This is done with the decode method in both cases. However if the string was created as a CData object, it also needs to be converted to a byte string first:

py_string = ffi.string(char_array).decode("utf-8")

If it was passed as a byte string it can be decoded directly instead:

py_string = c_string.decode("utf-8")

Putting this information to use in a checker example, let's make a checker for a function that emulates the Python title string method, i.e. converts the first character of every word in a string into uppercase (and the rest into lowercase, but we're skipping this part). In this scenario, since the C function writes into a string, we need both approaches presented above. The test vector generates the original string and also creates the target for writing, and includes length.

def gen_vector():
    v = []
    for i in range(10):
        case = ""
        for j in range(random.randint(2, 4)):
            for k in range(random.randint(2, 8)):                
                case += random.choice(string.ascii_lowercase)            
            case += " " 
        v.append([case[:-1].encode("utf-8"), ffi.new("char[{}]".format(len(case))), len(case) - 1])
        
    return v

The reference does what the student submission is expected to do, without needing to touch either the result string or the length:

def ref_func(orig, res, n):
    return orig.decode("utf-8").title()
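
The reference can be tried out on its own without any C code. The short check below repeats ref_func from above and calls it with a hand-written byte string; the second and third arguments are placeholders because the reference ignores them.

```python
def ref_func(orig, res, n):
    return orig.decode("utf-8").title()

expected = ref_func(b"lorem ipsum dolor", None, 17)
print(expected)                 # Lorem Ipsum Dolor

# str.title treats any non-letter as a word boundary, apostrophes included.
print("it's".title())           # It'S
```

The apostrophe behavior is one reason why it is convenient that the test vector only contains lowercase ASCII letters and spaces.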

Just like in the previous example, a result object extractor is needed to get access to the string from the argument list. This shows the full conversion from a CData object back to unicode string.

def res_string_extractor(args, res, parsed):
    return ffi.string(args[1]).decode("utf-8")

A function call presenter is also needed, but since it's quite similar to the one used in the previous example it's only shown in the full example. Calling test_c_function is also similar, but shown here for completeness anyway:

c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang, 
    result_object_extractor=res_string_extractor,
    presenter=presenters
)

title_test.py

import random
import string
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "otsikoksi",
    "en": "to_title"
}

so_name = {
    "fi": "otsikko",
    "en": "title"
}

orig_str = {
    "fi": "alkup",
    "en": "original"
}

res_str = {
    "fi": "tulos",
    "en": "result"
}

msgs = core.TranslationDict()

def gen_vector():
    v = []
    for i in range(10):
        case = ""
        for j in range(random.randint(2, 4)):
            for k in range(random.randint(2, 8)):                
                case += random.choice(string.ascii_lowercase)            
            case += " "
        v.append([case[:-1].encode("utf-8"), ffi.new("char[{}]".format(len(case))), len(case) - 1])
        
    return v

def ref_func(orig, res, n):
    return orig.decode("utf-8").title()

def res_string_extractor(args, res, parsed):
    return ffi.string(args[1]).decode("utf-8")

def call_string_presenter(fname, args):
    # Decode the byte string so the log shows a C string literal, not b'...'
    orig_str_def = 'char {}[] = "{}";\n'.format(orig_str[lang], args[0].decode("utf-8"))
    res_str_def = "char {}[{}];\n".format(res_str[lang], args[2] + 1)
    call = fname + "({}, {}, {});".format(orig_str[lang], res_str[lang], args[2])
    return "{{{highlight=c\n" + orig_str_def + res_str_def + call + "\n}}}"

presenters = {
    "call": call_string_presenter
}

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang, req_stdio=True)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang, 
            result_object_extractor=res_string_extractor,
            presenter=presenters
        )

Using Structs¶

Structs are the most complicated objects to involve in C checkers. Fortunately even they aren't particularly complicated. Like pointers and mutable strings, structs are created with ffi.new. On the Python side, structs work just like Python objects (i.e. class instances). However, when setting their attributes, all of the previous cautions must be taken into consideration. In other words, numbers, single characters and lists of numbers are converted easily enough.

Before a struct can be created, CFFI needs to know its type definition. The chosen approach in PySenpai is to provide the struct definition by the checker. Type definitions are provided to load_library via its typedefs keyword argument. This argument takes a dictionary of language code keys, where each key matches a list of strings where the strings are valid C type definitions (in this case structs). Of course if the student code's definition for the struct is different, there will be trouble.

The example is from a checker in the Computer Systems course. It defines a simpler structure, point, which is a coordinate pair, and a more complex structure that contains these point structures. The reason for the slightly complicated example is to show that creating structs within structs is a bit weird. As stated, the first order of business is to create the type definitions for structs. This is done in two languages below:

typedefs = {
    "en": [
        "struct point {int x; int y;};",
        "struct rect {struct point max; struct point min; struct point all_points[10];};"
        ],
    "fi": [
        "struct piste {int x; int y;};",
        "struct laatikko {struct piste max; struct piste min; struct piste pisteet[10];};"
    ]
}

The task in this case is to create a function that finds the furthest and closest points (to origin, i.e. longest and shortest vector) and then writes those points into the max and min properties of the struct. So, if we wanted to create a point structure with values, we would do something like this (shown in English only for simplicity):

point = ffi.new("struct point *")
point.x = 5
point.y = 5

This is relatively straightforward as you can see. However, if we want to put an array of these point structs inside the rect struct some weird stuff is going on. The logical way to do this would be to create a bunch of point structs and put them in a list, then attach that list to the rect struct. Except this does not work, and the solution that works is actually simpler. The list of points can actually be defined as a list of tuples on the Python side, and it gets converted correctly when entering the C side. So the test vector is created like this:

def gen_vector():
    v = []
    for i in range(5):
        ts = ffi.new("struct {} *".format(typenames.get_msg("rect", lang)))
        points = [(random.randint(0, 40), random.randint(0, 40)) for i in range(10)]
        setattr(ts, typenames.get_msg("pointlist", lang), points)
        v.append([ts])
    return v

Of course it does make sense if you consider how structs work in C - it's just a continuous area of memory, so CFFI can just write the contents of the Python list there. The typenames referenced here is a TranslationDict. It's needed because the structs have different names and property names. It looks like this:

typenames = core.TranslationDict()
typenames.set_msg("rect", "fi", "laatikko")
typenames.set_msg("rect", "en", "rect")
typenames.set_msg("point", "fi", "piste")
typenames.set_msg("point", "en", "point")
typenames.set_msg("pointlist", "fi", "pisteet")
typenames.set_msg("pointlist", "en", "all_points")

The reference function shows that on the Python side this struct acts just like an object. We can use getattr to grab the list of points and find the minimum and maximum. The chosen method (sorting by length) requires us to create a copy of the list.

def ref_func(ts):    
    points = list(getattr(ts, typenames.get_msg("pointlist", lang)))
    points.sort(key=lambda p: (p.x ** 2 + p.y ** 2) ** 0.5)
    return points[0], points[-1]
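
The selection logic can be tested separately from CFFI by using a namedtuple as a stand-in for the point struct, as in the sketch below. The names Point and closest_and_furthest are made up for the illustration. The sketch sorts by squared length, which gives the same order as sorting by length.

```python
from collections import namedtuple

# Stand-in for a CFFI point struct: it only needs x and y attributes.
Point = namedtuple("Point", ["x", "y"])

def closest_and_furthest(points):
    ordered = sorted(points, key=lambda p: p.x ** 2 + p.y ** 2)
    return ordered[0], ordered[-1]

points = [Point(3, 4), Point(1, 1), Point(10, 0), Point(6, 7)]
closest, furthest = closest_and_furthest(points)
print(closest, furthest)
```

One thing to keep in mind with random vectors is that two points can be equally far from the origin. In that case the reference and a correct student function might pick different points, so it may be worth generating vectors without such ties.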

To get the min and max points from the structure to the validator, a result object extractor is used:

def maxmin_extractor(args, res, out):
    return args[0].min, args[0].max

Even though the reference and this function now return similar values, they are not the same so the default validator does not work. A custom validator that compares the numerical values contained within the points is needed instead.

def point_validator(ref, res, out):
    assert ref[0].x == res[0].x and ref[0].y == res[0].y, "fail_minpoint"
    assert ref[1].x == res[1].x and ref[1].y == res[1].y, "fail_maxpoint"
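
Since only the first failed assert is reported, the order of the asserts decides which message the student sees. The sketch below runs the validator against namedtuple stand-ins; the helper failed_key is hypothetical and simply returns the message key carried by the AssertionError.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])  # stand-in for the CFFI struct

def point_validator(ref, res, out):
    assert ref[0].x == res[0].x and ref[0].y == res[0].y, "fail_minpoint"
    assert ref[1].x == res[1].x and ref[1].y == res[1].y, "fail_maxpoint"

def failed_key(ref, res):
    """Return the message key of the first failed assert, or None."""
    try:
        point_validator(ref, res, "")
    except AssertionError as error:
        return str(error)
    return None

ref = (Point(1, 1), Point(10, 0))
print(failed_key(ref, (Point(1, 1), Point(10, 0))))  # None
print(failed_key(ref, (Point(1, 1), Point(3, 4))))   # fail_maxpoint
print(failed_key(ref, (Point(3, 4), Point(3, 4))))   # fail_minpoint
```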

These are most of the pieces required; the rest (presenters) are shown in the full example. Calling load_library and test_c_function with these pieces is shown separately below:

st_lib = c.load_library(files[0], so_name, lang, typedefs=typedefs)
if st_lib:
    c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
        custom_msgs=custom_msgs,
        result_object_extractor=maxmin_extractor,
        presenter=presenters,
        validator=point_validator
    )

And the full example:

maxmin_test.py

import re
import os
import random
import sys
import test_core as core
import c_extension as c

ffi = c.ffi

star_pat = re.compile(r"\*+")

custom_msgs = core.TranslationDict()
custom_msgs.set_msg("PrintStudentResult", "fi", "Funktiosi valitsi minimi- ja maksimipisteet seuraavasti:\n{res}")
custom_msgs.set_msg("PrintStudentResult", "en", "Your function selected the following points for min/max:\n{res}")
custom_msgs.set_msg("PrintReference", "fi", "Pisteet jotka olisi pitänyt valita:\n{ref}")
custom_msgs.set_msg("PrintReference", "en", "Points that should have been selected:\n{ref}")
custom_msgs.set_msg("PrintTestVector", "fi", "Funktiolle annettu tietue sisälsi seuraavat pisteet:\n{args}")
custom_msgs.set_msg("PrintTestVector", "en", "The struct given to your function included these points:\n{args}")
custom_msgs.set_msg("fail_minpoint", "fi", "Funktio valitsi minimipisteen väärin.")
custom_msgs.set_msg("fail_minpoint", "en", "The function set the minimum point wrong.")
custom_msgs.set_msg("fail_maxpoint", "fi", "Funktio valitsi maksimipisteen väärin.")
custom_msgs.set_msg("fail_maxpoint", "en", "The function set the maximum point wrong.")

typenames = core.TranslationDict()
typenames.set_msg("rect", "fi", "laatikko")
typenames.set_msg("rect", "en", "rect")
typenames.set_msg("point", "fi", "piste")
typenames.set_msg("point", "en", "point")
typenames.set_msg("pointlist", "fi", "pisteet")
typenames.set_msg("pointlist", "en", "all_points")

st_function = {
    "en": "find_maxmin",
    "fi": "etsi_maxmin"
}

so_name = {
    "en": "maxmin",
    "fi": "maxmin"
}

typedefs = {
    "en": [
        "struct point {int x; int y;};",
        "struct rect {struct point max; struct point min; struct point all_points[10];};"
        ],
    "fi": [
        "struct piste {int x; int y;};",
        "struct laatikko {struct piste max; struct piste min; struct piste pisteet[10];};"
    ]
}

def gen_vector():
    v = []
    for i in range(5):
        ts = ffi.new("struct {} *".format(typenames.get_msg("rect", lang)))
        points = [(random.randint(0, 40), random.randint(0, 40)) for i in range(10)] 
        setattr(ts, typenames.get_msg("pointlist", lang), points)
        v.append([ts])            
    return v

def ref_func(ts):    
    points = list(getattr(ts, typenames.get_msg("pointlist", lang)))
    points.sort(key=lambda p: (p.x ** 2 + p.y ** 2) ** 0.5)
    return points[0], points[-1]        
    
def maxmin_extractor(args, res, out):
    return args[0].min, args[0].max

def point_validator(ref, res, out):
    assert ref[0].x == res[0].x and ref[0].y == res[0].y, "fail_minpoint"
    assert ref[1].x == res[1].x and ref[1].y == res[1].y, "fail_maxpoint"
    
def maxmin_presenter(value):
    return "min: {minp.x}, {minp.y}\nmax: {maxp.x}, {maxp.y}".format(minp=value[0], maxp=value[1])
    
def args_presenter(value):
    return " ".join("({}, {})".format(p.x, p.y) for p in getattr(value[0], typenames.get_msg("pointlist", lang)))
    
presenters = {
    "ref": maxmin_presenter,
    "res": maxmin_presenter,
    "arg": args_presenter
}
    
if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang, typedefs=typedefs)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
            custom_msgs=custom_msgs,
            result_object_extractor=maxmin_extractor,
            presenter=presenters,
            validator=point_validator
        )

Diagnosis Features¶

Due to the default messages of PySenpai, even by following the basic checker instructions above most checkers give relatively useful feedback. However, to truly reduce TA work, PySenpai offers a few ways to create catches for typical mistakes, and provide additional hints when students make them. Creating these is often an iterative process as year after year you have more data about what kinds of mistakes students make. However, making some initial guesses can be helpful too.

Implementing a diagnostic typically involves creating a function that discovers the mistake, and one or more messages that are shown in the evaluation output when the mistake is encountered. Attaching hints and highlight triggers is also common as they can provide much more accurate information when your checker is pretty certain of the nature of the mistake.

There are three ways to implement diagnosis functions: error/false references, custom tests and information functions. There is some overlap, but usually it's pretty straightforward to choose the right one. As stated in the PySenpai overview, there is also a return value recurrence check built into PySenpai. This is enabled by default and should be disabled when testing functions that don't return or modify anything (i.e. functions that only print stuff).

False Reference Functions¶

False or error reference functions are one of the most convenient ways to identify commonly made mistakes by students. A false reference is a modified copy of the actual reference function that emulates a previously identified erroneous behavior. In tests, these functions are treated like reference functions. They are called in the diagnosis step, and the student’s result is compared with the false reference result using the same validator as the test itself. If it matches (i.e. there is no AssertionError), it is highly likely that the student has made the error that’s emulated by the false reference.

False references are usually very easy to implement. Attaching messages to them is also very simple. When PySenpai gets a match with the student result against the false reference, it looks up a message with the false reference function's name. Do note that this message must be provided by the checker - there is no default message. Having a default would not make sense because PySenpai cannot know what your false reference function wants to say. False references are passed to test_c_function as a list of functions, so you can have as many as you want.

Let's assume our students have a hard time remembering to multiply the result of m * v ** 2 by 0.5 when calculating kinetic energy. In this case the false reference would simply be:

def eref_2x_energy(v, m):
    return m * v * v

After creating the function we also need the corresponding message in our custom messages dictionary.

msgs = core.TranslationDict()
msgs.set_msg("eref_2x_energy", "fi", dict(
    content="Funktion palauttama tulos oli 2x liian suuri.",
    hints=["Tarkista kineettisen energian laskukaava."]
))
msgs.set_msg("eref_2x_energy", "en", dict(
    content="The function's return value was 2 times too big.",
    hints=["Check the formula for kinetic energy."]
))

Finally modify test_c_function call to include our new diagnosis function.

c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
    custom_msgs=msgs,
    error_refs=[eref_2x_energy]
)
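
The matching step described above can be modeled roughly as follows. This is a simplified sketch and not PySenpai's implementation: the validator and the helper matching_false_refs are hypothetical, and the student result is given by hand instead of coming from C code.

```python
def ref_func(v, m):
    return 0.5 * m * v * v

def eref_2x_energy(v, m):
    return m * v * v

def validator(ref, res, out):
    assert abs(ref - res) < 1e-9

def matching_false_refs(args, student_result, error_refs):
    """Names of the false references that the student result matches."""
    matches = []
    for eref in error_refs:
        try:
            validator(eref(*args), student_result, "")
        except AssertionError:
            continue  # no match, so this mistake was not made
        matches.append(eref.__name__)  # the name doubles as the message key
    return matches

args = (3.0, 2.0)
print(ref_func(*args))                                     # 9.0
print(matching_false_refs(args, 18.0, [eref_2x_energy]))   # ['eref_2x_energy']
print(matching_false_refs(args, 9.0, [eref_2x_energy]))    # []
```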

Full example, based on the minimal checker we created earlier.

kinetic_test_eref_c.py

import random
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "laske_kineettinen_energia",
    "en": "calculate_kinetic_energy"
}

so_name = {
    "fi": "energia",
    "en": "energy"
}

msgs = core.TranslationDict()
msgs.set_msg("eref_2x_energy", "fi", dict(
    content="Funktion palauttama tulos oli 2x liian suuri.",
    hints=["Tarkista kineettisen energian laskukaava."]
))
msgs.set_msg("eref_2x_energy", "en", dict(
    content="The function's return value was 2 times too big.",
    hints=["Check the formula for kinetic energy."]
))

def gen_vector():
    v = []
    for i in range(10):
        v.append((round(random.random() * 100, 2), round(random.random() * 100, 2)))
    
    return v

def ref_func(v, m):
    return 0.5 * m * v * v

def eref_2x_energy(v, m):
    return m * v * v

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_library(files[0], so_name, lang)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
            custom_msgs=msgs,
            error_refs=[eref_2x_energy]
        )

Custom Tests¶

Custom tests are additional validator functions, but instead of validating the result, they check the result for known mistakes. To this end they are given more arguments by PySenpai than normal validators. A custom test can make use of arguments, inputs and raw output of the student function in addition to what's available to normal validators (i.e. result, values parsed from output and reference). A custom test can make multiple asserts the same way a validator does. Likewise, each assert can be connected to a different message. If a corresponding message is not found, PySenpai uses the function's name to fetch a message (if this fails it raises a KeyError).

The overlap of custom tests and validators is largely due to historical reasons. In the past validators did not use assert statements - they simply returned True or False. This meant that custom tests were needed whenever a more accurate statement about the problem was called for. With assert statements, validators can do most of the work that was previously done by custom tests. However there are still some valid reasons to use custom tests.

Since custom tests are only run if the initial validation fails, they can include tests that would occasionally trigger with a correctly behaving function. Another advantage is that they give information on top of the validation rejection message (remembering that only the first failed assert is reported). Also if you are otherwise content with the default validator or one of the built-ins, custom tests can provide the additional checking that is not necessary for validating but can be useful for the student to know.

The example shown here is pretty simple. It's from a test that uses a standard validator to check a prompt function that is expected to return an integer that is 2 or bigger. The custom test is added to draw more attention to situations where the student function returns a number that's smaller than 2 as this is something they may have missed in the exercise description.

custom_msgs.set_msg("fail_less_than_two", "fi", dict(
    content="Funktio palautti luvun joka on pienempi kuin kaksi.",
    hints=["Varmista, että kyselyfunktio tarkistaa myös onko luku suurempi kuin 1."]
))
custom_msgs.set_msg("fail_less_than_two", "en", dict(
    content="Your function returned a number that's smaller than two.",
    hints=["Make sure your input function also checks that the given number is greater than 1."]
))

def less_than_two(res, parsed, out, ref, args, inps):
    if isinstance(res, int):
        assert res > 1, "fail_less_than_two"
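
The custom test can be tried out by hand before attaching it to a checker. The sketch below repeats less_than_two and calls it with placeholder values for the arguments it does not use; the helper triggered_key is hypothetical.

```python
def less_than_two(res, parsed, out, ref, args, inps):
    if isinstance(res, int):
        assert res > 1, "fail_less_than_two"

def triggered_key(test, res):
    """Return the message key raised by a custom test, or None."""
    try:
        test(res, None, "", None, [], [])
    except AssertionError as error:
        return str(error)
    return None

print(triggered_key(less_than_two, 1))      # fail_less_than_two
print(triggered_key(less_than_two, 5))      # None
print(triggered_key(less_than_two, "abc"))  # None, non-integers are skipped
```

The isinstance guard keeps the test quiet when the student function returns something other than an integer, which is a different mistake and is better left to the validator.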

Information Functions¶

Information functions are in many ways similar to custom tests, but their results are reported differently. Where custom tests report the message for the failed assertion (if any), information functions report a message that contains value(s) returned by the function. The use case for information functions is when you want to show something specific to the student instead of giving a verbal statement about the issue. Information functions receive the same arguments as custom tests. However, unlike custom tests, information functions must return a value.

Information functions need to be accompanied by a message that uses the function's name as its dictionary key. This message is given the return value of the information function as a keyword argument named func_res. Information functions are not expected to find something every time. When they find nothing worth reporting, they should raise NoAdditionalInfo. This signals test_function not to print the associated message at all.

An example use of this is from a checker that tests a function that finds out whether a given number is a prime. As the function return value is simply True or False, the report has no information which divisor the student function may have missed when it gives a false positive. To add this information, we can make an information function that returns the smallest divisor for a non-prime number. If the number is a prime, there is no divisor to show - therefore NoAdditionalInfo is raised.

custom_msgs.set_msg("show_divisor", "fi", "Luku on jaollinen (ainakin) luvulla {func_res}")
custom_msgs.set_msg("show_divisor", "en", "The number's (first) factor is {func_res}")

def show_divisor(res, parsed, output, ref, args, inputs):
    for i in range(2, int(args[0] ** 0.5) + 1):
        if args[0] % i == 0:
            return i
    raise core.NoAdditionalInfo
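
The two possible outcomes can be tried out without PySenpai. In this sketch the NoAdditionalInfo class is a local stand-in for core.NoAdditionalInfo, and report is a hypothetical helper that only imitates how the message might get filled in; the real reporting is done by PySenpai.

```python
class NoAdditionalInfo(Exception):
    """Stand-in for core.NoAdditionalInfo so the sketch runs on its own."""

def show_divisor(res, parsed, output, ref, args, inputs):
    # Trial division up to the square root of the tested number.
    for i in range(2, int(args[0] ** 0.5) + 1):
        if args[0] % i == 0:
            return i
    raise NoAdditionalInfo

msg = "The number's (first) factor is {func_res}"

def report(number):
    """Hypothetical helper: return the formatted message, or None if there is none."""
    try:
        func_res = show_divisor(True, None, "", False, [number], [])
    except NoAdditionalInfo:
        return None
    return msg.format(func_res=func_res)

print(report(91))  # The number's (first) factor is 7
print(report(97))  # None
```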

Advanced Tips¶

This section covers miscellaneous tricks that we have used to implement fringe case checkers in the past. Some are small hacks while others required a large amount of work. Most of these examples may not be directly useful, but they should give you some ideas on how to proceed when there is no clear path to implementing a checker.

Using Single Precision Floats¶

One limitation of Python for the purpose of interacting with C code is that all floating point numbers in Python are double precision. This means that when reference functions are implemented in Python, the results returned by them will not match those of the C implementation if single precision floats are used in the exercise. In order to reduce a Python float to a single precision C float, the cast method of CFFI can be used:

c_float = ffi.cast("float", python_float)

The only problem is that the value produced by this method cannot be used as a number in Python calculations. However, it can be converted back to a Python float that retains the reduced precision.

python_float = float(ffi.cast("float", python_float))

Depending on the circumstances this may be needed multiple times in the reference when doing calculations. Also, a rounding validator is recommended to be used with this as usual because different ways of implementing calculations can still create differences in rounding.
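
As a rough illustration of what repeated conversion looks like in a reference, here is a sketch that runs without CFFI. It uses ctypes.c_float from the standard library as a stand-in, on the assumption that it rounds the same way as ffi.cast("float", ...) on platforms where a C float is an IEEE 754 single precision number. Both to_single and ref_kinetic_single are hypothetical names, and how many intermediate roundings are needed depends on how the C solution is expected to be written.

```python
import ctypes

def to_single(value):
    """Round a Python float to single precision and return it as a Python float."""
    return ctypes.c_float(value).value

def ref_kinetic_single(v, m):
    # Round the inputs and every intermediate product, imitating a
    # calculation where each step is stored in a float variable.
    v = to_single(v)
    m = to_single(m)
    half_m = to_single(0.5 * m)
    half_mv = to_single(half_m * v)
    return to_single(half_mv * v)

print(to_single(0.1))                    # no longer equal to 0.1
print(ref_kinetic_single(12.34, 56.78))  # reduced precision result
print(0.5 * 56.78 * 12.34 * 12.34)       # double precision result
```

The two results typically differ after the first few significant digits, which is why a rounding validator is still needed on top of this.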

Accessing Global Variables¶

As stated earlier, the normal way of loading a library by precompiling the library file doesn't allow access to global variables within the library. In order to gain access to them, load_with_verify must be used. The upside of using this is that there is no need for a separate compilation step when creating the Lovelace exercise. The downside is that if the compilation fails the student won't get any useful information from the checker. The only thing reported is the CompileError message (see message reference).

Just like with everything else, when using global variables CFFI needs to be told what they are. The chosen approach is the same as with structs: global variable definitions must be provided by the checker. They go into the same typedefs keyword argument of load_with_verify. This argument takes a dictionary with language codes as keys and lists of strings as values - each string in the list should define one variable.

Time to modify a familiar example. We'll take the kinetic energy calculator and have it write its result to a global variable. The checker is then modified to read the result from there. Since we're using load_with_verify we can do away with the compiling step and don't need to define a library name in the checker. What we do need are dictionaries that contain the name of the result variable, and the definition that allows us to expose it through CFFI. If you try to do this in a checker that uses load_library, this variable will never change its value.

result_var = {
    "fi": "tulos",
    "en": "result"
}

global_defs = {
    "fi": [
        "double tulos;"
    ],
    "en": [
        "double result;"
    ]
}

No changes are needed in the test vector or reference. To obtain the result from the global variable, a result object extractor is used. This time it doesn't get the result from any of its arguments though, instead it just accesses the globally readable student library object. This is what we need the variable name dictionary for.

def result_extractor(args, res, parsed):
    return getattr(st_lib, result_var[lang])
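
Note that the extractor refers to st_lib and lang, which are module-level names assigned only later in the checker. This works because Python looks them up when the function is called, not when it is defined. The sketch below demonstrates the lookup with a SimpleNamespace object standing in for the loaded library; the stored value and the argument list are invented.

```python
from types import SimpleNamespace

result_var = {
    "fi": "tulos",
    "en": "result"
}

def result_extractor(args, res, parsed):
    # st_lib and lang are resolved at call time from the module namespace.
    return getattr(st_lib, result_var[lang])

# Stand-ins for what the checker's main section would normally provide.
st_lib = SimpleNamespace(result=0.0)
lang = "en"

st_lib.result = 1234.5  # pretend the student function wrote its result here
print(result_extractor([10.0, 24.69], None, None))  # 1234.5
```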

That's actually all the pieces for this simple example. All that's left is to call the two functions that are needed to make this work.

st_lib = c.load_with_verify(files[0], lang, typedefs=global_defs)
if st_lib:
    c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
        result_object_extractor=result_extractor
    )

And the full example is shown below:

kinetic_test_global.py

import random
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "laske_kineettinen_energia",
    "en": "calculate_kinetic_energy"
}

msgs = core.TranslationDict()

result_var = {
    "fi": "tulos",
    "en": "result"
}

global_defs = {
    "fi": [
        "double tulos;"
    ],
    "en": [
        "double result;"
    ]
}

def gen_vector():
    v = []
    for i in range(10):
        v.append((round(random.random() * 100, 2), round(random.random() * 100, 2)))
    
    return v

def ref_func(v, m):
    return 0.5 * m * v * v

def result_extractor(args, res, parsed):
    return getattr(st_lib, result_var[lang])

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_with_verify(files[0], lang, typedefs=global_defs)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
            result_object_extractor=result_extractor
        )

Writing to Global Variables¶

Writing to global variables is a similar process, and doesn't need any extra steps. You can set the value just like you would set a value for an attribute of a Python object, i.e. either object.value = 1 or with setattr - the latter is more commonly used for multilingual checkers. The proper place to do this is the new_test callback that can be given to test_c_function as a keyword argument. This callback receives two arguments: the list of arguments and the list of inputs. If inputs are not used in the exercise, you can use the inputs list to smuggle values to new_test. Otherwise you need to make a global list and pop values from there.

This example shows the latter method because it requires a bit more Python know-how. Please note that this is not really an intended way of using PySenpai, so there are a lot of complications involved. If possible, (ab)using the input vector is far better.

Time to modify the kinetic energy calculator once more. This time it does everything with global variables, so it has no arguments - a really dumb design, but adequate for our demonstration here. We still need to generate the values that will be written somehow. Our test vector generator can still do that, but instead of returning them in a list, it writes them into a global list inside the checker. Instead of arguments, it just returns an empty list for each test case (remember that the number of test cases is defined by the length of the test vector).

In order to navigate this mess, some global variables are needed. The first one keeps the arguments for the actual testing while the second one keeps them for the reference results (needed because all reference results are generated before the student function is called even once). The last variable is a temporary holder for the values used in the currently ongoing test case.

stored_args = []
cloned_args = []
current = None

Values can be generated into these lists while we generate ten empty lists into the test vector:

def gen_vector():
    v = []
    for i in range(10):        
        stored_args.append((round(random.random() * 100, 2), round(random.random() * 100, 2)))
        v.append([])
    
    cloned_args.extend(stored_args[:])
    return v

For the first time we need to actually change the reference. This time it has no arguments to read from so we need to read from cloned_args manually.

def ref_func():
    m, v = cloned_args.pop()
    return 0.5 * m * v * v
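
Both lists are consumed with pop, which takes values from the end of the list. The values are therefore used in the reverse of their generation order, but since both lists are consumed the same way, each test case still gets the values its reference result was computed from. The following standalone simulation sketches this, assuming that the reference is called once per test case in test vector order and that each test case later pops one entry from stored_args.

```python
import random

stored_args = []
cloned_args = []

def gen_vector():
    v = []
    for i in range(10):
        stored_args.append((round(random.random() * 100, 2), round(random.random() * 100, 2)))
        v.append([])
    cloned_args.extend(stored_args[:])
    return v

def ref_func():
    m, v = cloned_args.pop()
    return 0.5 * m * v * v

vector = gen_vector()

# Phase 1: all reference results are produced before any testing happens.
ref_results = [ref_func(*case) for case in vector]

# Phase 2: each test case takes its values from the other list.
test_values = [stored_args.pop() for case in vector]
recomputed = [0.5 * m * v * v for m, v in test_values]

print(recomputed == ref_results)  # True
```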

For the student function the same values need to be written into the global variables in the C library. As stated earlier, this should be done in the new_test callback. The following function does just that:

def write_globals(args, inputs):
    global current
    current = stored_args.pop()
    m, v = current
    st_lib.m = m
    st_lib.v = v

When assigning to a global variable, it must first be declared global within the function. Otherwise Python will create a local variable with the same name. This is one of the reasons we don't usually muck about with global variables in Python. The reason we're taking the values out of the list here and storing them in a temporary variable is that we want to be sure they're removed regardless of what happens afterwards.
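
If the names of the C global variables differ between languages, the same callback can be written with setattr. This is only a sketch: the variable names in mass_var and velocity_var are hypothetical, and a SimpleNamespace object stands in for the loaded library so that the code runs without PySenpai.

```python
from types import SimpleNamespace

# Hypothetical per-language names for the C global variables.
mass_var = {"fi": "massa", "en": "mass"}
velocity_var = {"fi": "nopeus", "en": "velocity"}

# Stand-ins for what the checker would normally provide.
st_lib = SimpleNamespace(mass=0.0, velocity=0.0)
lang = "en"
stored_args = [(1.0, 2.0), (3.5, 4.25)]
current = None

def write_globals(args, inputs):
    global current
    current = stored_args.pop()
    m, v = current
    # setattr picks the attribute by name, so the name can depend on lang.
    setattr(st_lib, mass_var[lang], m)
    setattr(st_lib, velocity_var[lang], v)

write_globals([], [])
print(st_lib.mass, st_lib.velocity)  # 3.5 4.25
```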

Because this way of providing data to the student function is not supported by PySenpai, it does not know how to present it either. In order to show the student what values we're using in tests, we need a custom call presenter. It shows the assignments to global variables along with the function call.

def set_globals_presenter(fname, args):
    m, v = current
    set_m = "m = {};\n".format(m)
    set_v = "v = {};\n".format(v)
    call = fname + "();"
    return "{{{highlight=c\n" + set_m + set_v + call + "\n}}}"

presenters = {
    "call": set_globals_presenter
}
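
To check what the student would see, the presenter can be run on its own. In this sketch the value of current is invented, and it is assumed that fname arrives as a plain string containing the name of the student function.

```python
current = (2.5, 4.0)  # invented values standing in for one test case

def set_globals_presenter(fname, args):
    m, v = current
    set_m = "m = {};\n".format(m)
    set_v = "v = {};\n".format(v)
    call = fname + "();"
    # Plain concatenation is used here, so the literal braces of the
    # markup do not need to be escaped.
    return "{{{highlight=c\n" + set_m + set_v + call + "\n}}}"

text = set_globals_presenter("calculate_kinetic_energy", [])
print(text)
```

This prints the opening markup line, the two assignments, the function call and the closing braces, each on its own line.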

At this point we're done with the pieces. Time to call some functions. New global variables for m and v were also added to the global_defs dictionary (shown in the full example below).

st_lib = c.load_with_verify(files[0], lang, typedefs=global_defs)
if st_lib:
    c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
        result_object_extractor=result_extractor,
        presenter=presenters,
        new_test=write_globals            
    )

Full example:

kinetic_test_full_global.py

import random
import test_core as core
import c_extension as c
from cffi import FFI

ffi = FFI()

st_function = {
    "fi": "laske_kineettinen_energia",
    "en": "calculate_kinetic_energy"
}

msgs = core.TranslationDict()

result_var = {
    "fi": "tulos",
    "en": "result"
}

global_defs = {
    "fi": [
        "double tulos, v, m;",
    ],
    "en": [
        "double result, v, m;"
    ]
}

stored_args = []
cloned_args = []
current = None

def gen_vector():
    v = []
    for i in range(10):        
        stored_args.append((round(random.random() * 100, 2), round(random.random() * 100, 2)))
        v.append([])
    
    cloned_args.extend(stored_args[:])
    return v

def ref_func():
    m, v = cloned_args.pop()
    return 0.5 * m * v * v

def result_extractor(args, res, parsed):
    return getattr(st_lib, result_var[lang])

def write_globals(args, inputs):
    global current
    current = stored_args.pop()
    m, v = current
    st_lib.m = m
    st_lib.v = v

def set_globals_presenter(fname, args):
    m, v = current
    set_m = "m = {};\n".format(m)
    set_v = "v = {};\n".format(v)
    call = fname + "();"
    return "{{{highlight=c\n" + set_m + set_v + call + "\n}}}"

presenters = {
    "call": set_globals_presenter
}

if __name__ == "__main__":
    files, lang = core.parse_command()
    st_lib = c.load_with_verify(files[0], lang, typedefs=global_defs)
    if st_lib:
        c.test_c_function(st_lib, st_function, gen_vector, ref_func, lang,
            result_object_extractor=result_extractor,
            presenter=presenters,
            new_test=write_globals            
        )
