Processing and uploading scans

At this point we assume that you have printed the papers, fed them to your students (or the other way around), and collected the completed tests.

Scanning papers

To get the students’ work into Plom we have to get the physical papers scanned. The precise details of how you do this will depend on the scanning hardware that is available. We do strongly recommend that you

  • scan in colour,
  • use a resolution of at least 200 DPI,
  • do a trial run of a few papers to check that the scans are easy to read and files are not huge.


  • If you are not doing it yourself, then delegate to someone trustworthy. Scanning is a key task.
  • Staple removal is probably the slowest part of scanning. Spend a little time working out whether you should use scissors, a guillotine, or staple-remover.
  • Use good “paper hygiene”:
    • Sort papers into test-number order before scanning, and keep them in order.
    • Keep papers in sensible and physically manageable bundles, before, during and after scanning.
    • Have a good physical flow of papers from pre-scan to post-scan, so that each paper is scanned exactly once.
    • Give your scan-files clear names like mABCmt1-s1-b2.pdf (mathematics ABC, section 1, bundle 2), rather than easy but ambiguous ones like stuff-3.pdf.
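
As an illustration of the naming advice above, here is a minimal sketch that checks scan-file names before you start processing them. The pattern is an assumption modelled on the example mABCmt1-s1-b2.pdf; it is not something Plom enforces, and the function name parse_scan_name is hypothetical.

```python
import re

# Hypothetical convention, modelled on mABCmt1-s1-b2.pdf:
#   <label>-s<section>-b<bundle>.pdf
SCAN_NAME = re.compile(
    r"^(?P<label>[A-Za-z0-9]+)-s(?P<section>\d+)-b(?P<bundle>\d+)\.pdf$"
)


def parse_scan_name(filename):
    """Return (label, section, bundle), or None if the name does not fit."""
    match = SCAN_NAME.match(filename)
    if match is None:
        return None
    return (
        match.group("label"),
        int(match.group("section")),
        int(match.group("bundle")),
    )


if __name__ == "__main__":
    for name in ["mABCmt1-s1-b2.pdf", "stuff-3.pdf"]:
        print(name, "->", parse_scan_name(name))
```

Running a check like this over a directory of scans makes ambiguous names stand out before they reach the upload step.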

Make sure the server is running

Before we can upload anything the server must be running. Follow these instructions, and make sure that the “scanner” user is created. That user will have authority to upload scans. Once the server is running the “scanner” can upload PDFs so that our marking team can get to work.

Uploading to the server

Notice that the processing and uploading of scans need not be done on the same computer as the original PDF production. Indeed, Plom has been set up so that scanning can be delegated to the “scanner” user who may be running on a different computer entirely.

Make a working directory

First of all we should decide on a working directory for the scanning and upload process. An easy choice is to make a subdirectory called upload in the directory where we built our PDFs. Move into this directory and copy one of the scan PDFs here; we’ll assume we are working on the scans from a single bundle of papers called mABCmt1-s1-b2.pdf. If you run plom-scan by itself, then it will display a simple description of the required workflow and of the sub-commands we need.

Process a PDF into page-images

Since the scan PDF consists of images of many pages, we need to separate it into distinct page-images, and do a small amount of post-processing on those pages to darken light-grey pencil. This is done using plom-scan process <filename>. This produces quite a lot of output (from ghostscript), which you can ignore. The same command also creates some subdirectories in which Plom will file away your PDFs and page-images.

Take a quick look in the archivedPDFs directory, and you’ll find your original scan-file and archive.toml. The toml file keeps a simple record of the PDFs you have already uploaded (and their md5sums) to make sure that a given PDF cannot be uploaded twice.
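
To make the idea of the md5sum check concrete, here is a minimal sketch of detecting a repeated PDF by its checksum. It does not read archive.toml (its layout is not described here); the set known_md5sums and both function names are assumptions for illustration only.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def already_seen(path, known_md5sums):
    """True if a file with identical contents has been recorded before.

    known_md5sums is a hypothetical set of hex digests, standing in for
    whatever record is kept of previously handled PDFs.
    """
    return md5_of_file(path) in known_md5sums
```

Because the comparison is on file contents, renaming a PDF does not get it past such a check, while a fresh re-scan of the same papers does produce a different checksum.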

Read qr-codes, sanity check, and filing

Now use plom-scan read to read the qr-codes from the page-images that have just been processed. These qr-codes tell Plom which test-page is in each page-image. Plom also uses this as an opportunity to do some simple sanity checks against the test-specification and to determine the orientation of the scanned page. In order to perform some of these checks it needs to contact the server, and consequently you will need to supply the password of the “scanner” user.

$ plom-scan read
Please enter the 'scanner' password:
100%|████████████████████████████████████████████████████████| 54/54 [00:02<00:00, 20.12it/s]

At this stage all the page-images fall into two categories: “known” and “unknown”. The known pages are ready for upload, while the unknowns need more care and are not uploaded by default.

Upload the knowns

The “known”s are page images which contained legible and valid qr-codes which passed various sanity checks. These images are safe to upload to the server:

$ plom-scan upload
Upload images to server
Please enter the 'scanner' password:
Upload 0006,06,2 = t0006p06v2.mt1b1-36.png to server
Upload 0008,03,1 = t0008p03v1.mt1b1-45.png to server
Upload 0005,05,1 = t0005p05v1.mt1b1-29.png to server

You can see that the page-images have been named to tell you their “tpv” (test-page-version) as well as retaining some of their original filename (in this case mt1b1.pdf).
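
If you want to inspect these names in a script, the following minimal sketch splits a page-image name into its test, page and version numbers. The pattern is inferred only from the sample names shown above, and parse_tpv is a hypothetical helper, not part of Plom.

```python
import re

# Pattern inferred from names such as t0006p06v2.mt1b1-36.png
TPV_NAME = re.compile(
    r"^t(?P<test>\d+)p(?P<page>\d+)v(?P<version>\d+)\.(?P<rest>.+)$"
)


def parse_tpv(filename):
    """Return (test, page, version, rest) for a tpv-style page-image name."""
    match = TPV_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a tpv-style name: {filename!r}")
    return (
        int(match.group("test")),
        int(match.group("page")),
        int(match.group("version")),
        match.group("rest"),
    )


if __name__ == "__main__":
    print(parse_tpv("t0006p06v2.mt1b1-36.png"))
```

The trailing part of the name (here mt1b1-36.png) is kept as-is, since it carries the original scan-file name and the page's position within it.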

These uploads will succeed unless that particular page-image is already in the database – we call such pages “collisions” and they are a sign that something has probably gone wrong. More on these shortly.

Upload the unknowns

“Unknown”s are not a cause for concern; they typically occur for one of two reasons:

  • the page does not contain qr-codes because it is an extra-page used by a student. We encourage you to not give students completely blank paper since you will need to be able to identify which test and question that extra page belongs to. We have supplied a simple two-sided template for extra pages.

  • the qr-codes on the page were not legible for some reason. Generally this is due to some scanning issue such as a page being folded over, or skewed. Occasionally it is because someone wrote on the qr-code. In our experience this does not happen very often.

Before you upload the unknowns we recommend that you take a look at them using a simple image-browser. They are filed in the unknownPages subdirectory. If you spot a scanning problem then you can find and re-scan the required page.

“Unknowns” will need to be handled manually, and that task is delegated to the “manager” through the plom-manager client. However, before that can happen, the “unknowns” must be uploaded. To do so run plom-scan upload --unknowns.

Upload collisions (but hopefully not)

“Collisions” are generally, but not always, a cause for concern. They indicate that the Plom system has two page-images both claiming to be the same test-page. We strongly recommend that you look at the images in the collidingPages subdirectory before uploading them.

There are three main ways in which “collisions” might occur:

  • a given test was printed and used more than once: this is very bad and difficult to correct.
  • a given test was scanned twice: this is annoying and points to poor “paper hygiene”.
  • a given test-page was rescanned to replace an existing, but poor, scan in the system: this is okay.

In the first two cases, we recommend that you delete those collisions from the collidingPages subdirectory. Images falling into the last case should be uploaded and handled by the “manager” using the plom-manager client. To upload the “collisions” run plom-scan upload --collisions.
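
To illustrate what “two page-images both claiming to be the same test-page” means, here is a minimal sketch that groups tpv-style file names by (test, page) and reports any pair claimed more than once. It only looks at file names in a list; Plom's own collision detection is done against the server's database and is not reproduced here. The function name and the second sample file name (mt1b2-03) are hypothetical.

```python
import re
from collections import defaultdict

# Leading part of names such as t0006p06v2.mt1b1-36.png
TPV_PREFIX = re.compile(r"^t(\d+)p(\d+)v(\d+)\.")


def find_duplicate_claims(filenames):
    """Map (test, page) -> list of names, for pairs claimed by more than one image."""
    claims = defaultdict(list)
    for name in filenames:
        match = TPV_PREFIX.match(name)
        if match is None:
            continue  # names without a tpv prefix are skipped
        test, page, _version = (int(group) for group in match.groups())
        claims[(test, page)].append(name)
    return {key: names for key, names in claims.items() if len(names) > 1}


if __name__ == "__main__":
    sample = [
        "t0006p06v2.mt1b1-36.png",
        "t0006p06v2.mt1b2-03.png",  # hypothetical second scan of the same page
        "t0008p03v1.mt1b1-45.png",
    ]
    print(find_duplicate_claims(sample))
```

A report like this tells you that two images clash, but not which of the three causes above applies; that still requires looking at the images themselves.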

Getting a status report

It is sometimes helpful to check which papers have and have not been uploaded. It is also very helpful to see whether any papers have been only partially uploaded. To get such a status-summary, run plom-scan status. You will get a simple report such as

Please enter the 'scanner' password:
Test papers unused: [10–40]
Scanned tests in the system:
        1: [1–6]
        2: [1–6]
        3: [1–6]
        4: [1–6]
        5: [1–6]
        6: [1–6]
        7: [1–6]
        8: [1–6]
        9: [1–6]
Incomplete scans - listed with their missing pages:
