Robust Testing of PySheetGrader and Its Assignments

Up until now, there was no concrete way to perform push-button testing of the full behavior of PySheetGrader and PysheetGrader assignments. Manual testing is cumbersome, time-consuming, error-prone, and not repeatable. The latest release fills this gap by adding a testing mode to PySheetGrader as a way to specify cell-level unit tests in the CheckOrder tabs, and a special grading report format to display the test-mode test results. These unit tests are only run in the test mode and are different from the test cases specified in a cell’s rubric note using the test_cases property. The test_cases property is for grading a student’s assignment based on instructor-specified unit tests to decide on a cell’s correctness, whereas test-mode tests are used for testing the behavior of the grader itself before new releases as well as testing new assignments before deploying them to students.

Running PySheetGrader in test mode

The test mode is enabled by invoking PySheetGrader with the --test-mode or -T options from command line. The key file containing the unit test cases and the submission file containing the data for the test cases are specified as follows, in the usual way:

pysheetgrader TestGraderKey.xlsx TestGraderSub.xlsx --test-mode \ 
 --html-report-output Report.html

Here TestGraderKey.xlsx is the test key file containing the test-mode test cases and the TestGraderSub.xlsx is the test submission file containing the data. The --html-report-output option asks PySheetGrader to generate a detailed HTML report, which will be stored in the file Report.html.

Specifying test-mode test cases in CheckOrder tab

In the test mode, cell-level test cases are specified in a CheckOrder tab using additional, special columns with headers test-name, expected-result, and failure-message (this formal will be familiar to the reader who has experience with unit testing in software development). These columns are ignored outside the test mode, during regular grading. The column headers are order-independent and case insensitive. 

Test-mode column purposeColumn headingContent/Value
Test nametest-nametext
Expected resultexpected-resultcorrect or incorrect
Failure message to be added to the report in case of test failurefailure-messagetext

A CheckOrder tab of a test key file containing test-mode unit tests may look like this, with the addition of the extra columns used in test mode:

Note that column headers are case-insensitive. Therefore, for example, Expected-Result is valid as the column header for the expected result column.

The expected result column indicates whether the test expects the evaluation of that cell to be correct or incorrect. The PASS/FAIL status of the test in the produced report will consider what the expected result is to produce a verdict. This allows both correct and incorrect cell solutions in the test submission file to be tested. If no value is specified in this column, PySheetGrader assumes the default value correct. For example, the test named T4 expects the grader to produce a zero score (indicating an incorrect solution) with the given test submission file if the cell A5 is positively graded or a negative score if the cell A5 is negatively graded.

The test-mode grading report

The grading report generated using the test mode has extra columns that indicate the status of each test-mode test case. Failures produce red rows with a FAIL verdict under the Status column. A failure occurs when the test submission file contains a solution that produces an unexpected result for the tested cell. The failure message, if specified, is appended to the status. Passing tests generate green rows with a PASS verdict under the Status column (and no failure message).


We hope that the test mode will allow PySheetGrader itself as well as individual PySheetGrader assignments to be tested more throughly and in a repeatable manner, increasing the confidence of PySheetGrader’s contributors who maintain and extend it as well as instructors who design assignments.

Related Posts

Automatic Grading of Spreadsheet Assignments

Improvements to PySheetGrader

Acknowledgment

Aakash Javia implemented the test mode during the Spring 2022 semester.

Improvements to PySheetGrader

The new release of PySheetGrader includes bug fixes, minor improvements, and several new grading features. In this post, we discuss one of the improvements and all of the new features. The improvement makes the format of the grading instructions in SheetGradingOrder and CheckOrder tabs more flexible by allowing custom columns orders, while retaining backward compatibility. The new grading features include prerequisite cells, killer cells, support for negative (deduction-based) grading, and a variation of relative grading, called check rubric type, that evaluates a formula of the key file on the submission file and compares the result to a cell value stored in the key file.

The check rubric type

The check rubric type is a variation of the relative rubric type. The existing relative rubric type evaluates a formula stored in a cell C of the key file using the submission file’s corresponding cell references and compares the result to the evaluated value of the cell C in the submission file. The check rubric type also employs a similar relative grading strategy, but compares the result of the formula’s evaluation on the submission file to a given cell’s evaluated value in the key file rather than in the submission file.

As an example suppose we want to check that the value of the submission cell B1 is greater than the value of the submission cell B2. In the key file, we can enter the formula =B1 > B2 in a cell A1. We can pick another cell A10 in the key file to specify the expected result of this formula. Suppose we pick A10 for this purpose: we enter TRUE in this cell, which stipulates that B1 > B2 in the submission file. The rubric note on A1 could look like this (in this case, with an alternative cell defined as well):

rubric:
 score: 2
 type: check
 result: A10
alt_cells:
 - F10

Suppose in the submission file, the cell B1 evaluates to 4 and B2 evaluates to 5. Then the score for cell A1 will be 0 for submission because the formula B1 > B2 evaluates to FALSE, but should be TRUE. Note that, unlike in the relative rubric type, the cell A1 of the submission file is not used in this grading strategy.

The expected result can also be stored directly in the cell A1 itself of the key file. We could do this if the formula B1 > B2 in fact evaluates to TRUE in the key file itself. Then the result property can omitted in the rubric note, like this:

rubric:
 score: 2
 type: check
alt_cells:
 - F10

Here is another example using a numeric comparison. This time the key cell A1 contains the formula =B1 + B2 and A10 contains the expected result 8.95. (The result cell cannot contain a formula for now: it needs to be a constant and the grader will not evaluate the formula if a formula is entered in the result cell. We will change this later so the the result cell can also contain a formula.) The rubric note on A1 would also admit the delta property because the result is numeric. It could look like this:

rubric:
 score: 2
 type: check
 delta: .1
 result: A10

A1’s score will be 2 in this variation since the key cell’s formula evaluates to 9 on the submission file and is within a margin of 0.1 of the expected result 8.95.

Negative grading

Negative grading is a common strategy that applies deductions based on each submission mistake. It is used in cases where the student is expected to submit a correct solution for a graded component, and instead of receiving a positive score for the correct solution, the student receives a penalty (a negative score) for an incorrect one. In PysheetGrader, negative grading is implemented by a new grading property whose value is negative. This allows a negative score to be assigned to a graded cell. If the submission cell passes the negative rubric specification, it is assigned a zero score (no deduction). If not, the submission cell is deemed incorrect, and the specified negative cell score is applied as a deduction on the total score. In case of a deduction, a feedback message is added to the grading row of that cell in the report. The maximum score for a negatively graded cell is zero whereas in positive grading, the maximum score is the score property’s value.

To use negative grading, the rubric note must include the grading property:

rubric:
 grading: negative
 ...

If the grading property is omitted, its default value is positive, and the grader uses the normal positive grading behavior. The following rubric note explicitly specifies a normal, positively graded cell (in which case, the grading property can safely be omitted):

rubric:
 grading: positive
 ...

As an example, suppose the key cell A1 contains the formula =A2 + B1 with this rubric note:

rubric:
 grading: negative
 type: formula
 score: -5

If the submission cell A1 contains the formula =A3 + B1, the submission cell will be incorrect (the submission cell’s formula doesn’t match the key cell’s formula), the corresponding grading row in the HTML report will be red (representing a failure, with a corresponding feedback message), and a deduction of -5 will be applied to the submission’s total score. However, if the submission cell A1 contains the formula =B1 + A2, the cell is correct (the key and submission formulas match), and its score will equal zero (the corresponding grading row in the HML report will be green).

Negative grading can be applied to a hidden cell, and works with alternative cells (the alt_cells property).

Prerequisite cells

The new release of PySheetGrader allows the instructor to specify certain cells that must be correct for a particular cell to be graded. These prerequisite cells are specified in the rubric note using a new property prereq, whose format is similar to that of alt_cells. Here is an example:

The above rubric note stipulates that cells B1 and B2 be graded as correct before cell B4 can be graded. If either B1 or B2 fails, the HTML report will look like this (in this example, B2 is correct, but B3 is incorrect):

If the instructor desires, the student can be informed of the prerequisites when they make a mistake in one of the prerequisite cells using the feedback field. In the above example, the instructor explicitly states which prerequisites apply to cell B4 in the feedback given when the cell is graded as incorrect. Recall that when things go well, no feedback is given to the student.

Killer cells

PySheetGrader now allows instructors to specify must-be-correct cells, which, if incorrect, will revert the score of the whole tab (sheet) in which a graded cell is located back to zero. These cells are called killer cells. Killer cells can be specified as hidden if revealing them gives too much information to the student.

An example of a killer cell is shown below in a CheckOrder tab of a key file. Note the inclusion of the special column. (Column headings are now significant: see the section on custom column naming and ordering below.)

Here, cells B5 and B6 are both killer cells, indicated by the letter k under the special column for that cell. If one of these cells is incorrect in the submission file, the HTML report reads as below, and the whole tab receives a zero score. Whenever a killer cell is wrong, the grader stops grading the tab in which the cell is located, and truncates the tab’s report section at that point.

The grader can also handle the case where the instructor does not want to reveal which killer cell is wrong, in which case the special column entry for that cell is specified as hk (or kh). In this case the section of the HTML report for the tab looks like this:

Custom column naming and ordering

The earlier release of PySheetGrader required that the columns of the SheetGradingOrder and CheckOrder tabs be in a specific order. Recall that these special tabs specify the grading instructions for the whole assignment and for its individual parts (graded tabs), respectively. The information specified includes the grading order, cell descriptions, optional feedback messages, hidden cells, and now, the killer cells. Since the columns of the grading instructions were required to be in a fixed order, the column headings did not matter in the previous release: they were ignored. This design decision made adding new grading information and column-based features difficult and cumbersome.

The new release makes the columns specifying grading instructions order-independent, however column headers (the first rows of the SheetGradingOrder tab and the CheckOrder tabs) are now significant and fixed. Columns with headers other than the ones that PySheetGrader recognizes as grading instructions are ignored, and thus can be used for any other purpose, for example, for the instructor’s comments or annotations. Here is an example of how a CheckOrder tab may look like using the new format:

Note that column I is blank and can be used for whatever purpose the instructor deems necessary. The intent is to provide flexibility.

The fixed column headers and contents as follows. All column headings are case-insensitive.

In the SheetGradingOrder tab:

Column purposeColumn headingContent
The name of the tab (sheet) to be gradedname, sheet, or tabtext
Score required for minimum work for the tabmin-worknumeric, zero or positive
Feedback message if minimum work requirement for the tab is not reachedfeedbacktext

In a CheckOrder tab:

Column purposeColumn headingContent
Instructor’s custom identifier for a graded cellnumber, cell-id, or idtext or number
Cell coordinate (reference)cellcorrect alphanumeric Excel cell reference
Description of the graded celldescriptiontext
Cell grading instructions for special cells (for hidden and killer cells)specialany combination of characters h or H (for a hidden cell) and k or K (for a killer cell), without spaces and using the same capitalization:
k, K, h, H, kh, hk, KH, or HK

What comes next?

Robust and repeatable testing is ultimately essential for any auto-grader. The next blog will present the newly released test mode that allows PySheetGrader contributors to test the grading engine before deploying fixes and new features, and instructors to test their assignments before deploying them to students, all in a push-button style. We are also working on the ability to evaluate formulas within Excel to improve the grader’s performance and be able to use any Excel built-in function in formula-based rubrics.

Related posts

Automatic Grading of Spreadsheet Assignments

Testing PySheetGrader and its Assignments

Acknowledgements

Sandeep Keshavgowda implemented the new check rubric type as well as the killer-cell and prerequisite-cells features. Ksheeraja Raghavan implemented negative grading. Aakash Javia implemented the custom column naming and ordering feature.