Hi, I teach a CS course, and I was wondering if there is a practical way to set up a server that would accept students’ tar files, run some tests, and show them the results.
I could go “full unix mode” and roll up some accounts, let them ssh into a server, scp their files… but I was wondering if there is a prepackaged solution for this that is nicer to the eye. And I thought maybe you know some.
I actually work for a large university in the digital education department. We do have tools like this, but I’m pretty sure they’re for Python. They could probably be modified for other uses, however. I’m a hardware guy or I’d know more about it. If you’re interested I could probably get some more info or get you in touch with the devs that created it. DM if you want some more details.
You could definitely build something like this. You definitely want either human review before execution or a fair amount of sandboxing for whatever your students submit.
Do you want students trying to brute force or exfiltrate whatever test data lives in the server? If not, either they should just have the test cases already, or they can get back how many/which of the secret test cases they passed along with their grade, so showing them the results live might not be so important. Unless you want something like “you have 3 tries to pass the secret tests so you can get a hint that your own tests missed a case and go back and try to guess what it was”.
You also might want to invest time first in test harnesses for the students to run themselves, because you want them to learn good practices like coding against a test suite. If nothing else it makes it easier to make the auto-grader later if the students’ code is all already hooked up to the same test framework.
Teaching students how to fully use a multi-user Unix system can, for some topics, put unnecessary faffing about between the students and what they are trying to learn (are you teaching front-end web dev or something?), but in a lot of cases your students might actually be better served by something that makes them touch the deep magic than by a slick web UI that handles everything for them, as long as you turn it into a learning experience and not a protracted period of bafflement.
Does your school not already have some kind of shared CS department server/Unix environment for the students that could get you out of account management?
Also, the Right Way to get the code to the server is probably going to be Git and not a tarball. The students could/should be set up with a Git forge and indoctrinated in never leaving their code on their laptop to be sat upon and lost.
When I was in college, I took a 100-level CS course that required me to ssh into a server and run a command to submit my homework. It’s not crazy.
This is basically what CI/CD pipelines do.
Compile the code, run tests, run static analysis. If results pass, submit the code. If results fail, reject it with an explanation.
Idk the details of how you’d implement this for a class without letting everyone see each other’s completed work, but I’m sure it could be done.
My university used Artemis to do basically what you’re describing. Files are uploaded via Git. But it seems like self-hosting would be a lot of work.
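For what it's worth, the compile / test / analyze flow described above could be sketched roughly like this in Python. This is only a minimal sketch: the stage names and commands are placeholders I made up, and it would still need to run inside whatever sandbox you trust, since it executes submitted code.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class StageResult:
    name: str
    passed: bool
    output: str


def run_stage(name, cmd, cwd=".", timeout=60):
    """Run one pipeline stage and capture its combined output."""
    try:
        proc = subprocess.run(
            cmd, cwd=cwd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return StageResult(name, False, f"timed out after {timeout}s")
    return StageResult(name, proc.returncode == 0, proc.stdout + proc.stderr)


def run_pipeline(stages, cwd="."):
    """Run stages in order and stop at the first failure.

    The last result's output serves as the explanation for a rejection.
    """
    results = []
    for name, cmd in stages:
        result = run_stage(name, cmd, cwd)
        results.append(result)
        if not result.passed:
            break
    return results


# Hypothetical stages for a Python assignment; swap in your own compiler,
# test runner, and static analyzer.
EXAMPLE_STAGES = [
    ("compile", [sys.executable, "-m", "py_compile", "solution.py"]),
    ("tests", [sys.executable, "-m", "unittest", "discover"]),
]
```

Running each student's pipeline in a private checkout or private repo is one way to keep results from being visible to classmates.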
Hey, that was made at my former uni. And now I’m wondering whether other unis adopted it. It always seemed like a neat solution.
Why give your students a way to get RCE on your institution’s servers through anything less than a perfect file upload implementation?
For a .tar? I wish you the best…
Instead of that, simplify.
Use unique salts for each assignment per student.
Align hashes with those salts to check the outcome for each student’s assignment.
Literally have them send you a CTF style sha256 string.
Do it step by step where each step doesn’t depend on the next, grade as a percentage of flags accurately procured.
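Here is one possible reading of the salted-flag idea above, as a minimal Python sketch. The function names, the way the salt is derived from an instructor secret, and the `salt:answer` flag format are all my own assumptions, not something specified in the thread.

```python
import hashlib
import hmac


def derive_salt(secret: bytes, student_id: str, assignment: str, step: int) -> str:
    """Derive a per-student, per-assignment, per-step salt (assumed scheme).

    The instructor keeps `secret` private and hands each student their salts.
    """
    msg = f"{student_id}:{assignment}:{step}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()[:16]


def make_flag(salt: str, answer: str) -> str:
    """What the student computes and submits: sha256 over salt and answer."""
    return hashlib.sha256(f"{salt}:{answer}".encode()).hexdigest()


def grade(secret, student_id, assignment, correct_answers, submitted_flags):
    """Return the percentage of steps whose submitted flag matches."""
    hits = 0
    for step, answer in enumerate(correct_answers):
        salt = derive_salt(secret, student_id, assignment, step)
        expected = make_flag(salt, answer)
        submitted = submitted_flags[step] if step < len(submitted_flags) else ""
        # Each step is checked on its own, so one miss does not sink the rest.
        if hmac.compare_digest(expected.encode(), submitted.encode()):
            hits += 1
    return 100.0 * hits / len(correct_answers)
```

Because the salt differs per student, copying a classmate's flag scores nothing. One caveat under these assumptions: if the set of plausible answers is small, a student who knows their salt can simply hash every candidate, so this fits best where the answers are hard to guess.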
Absolutely this. Even if you had fancy jails or docker setups for each submission, this will be a nightmare to properly handle. Students DOSing each other exactly before the submission deadline, too.
I mean just for the love of God don’t spin up something on your company’s infrastructure that accepts file uploads.
Just don’t.
If you’re reading this and going “well, it’s just internal,” or “well, it doesn’t do much, it just accepts this exact file type.” My god. Ask your CISA. And if they’re okay with it, cool. That’s on them.
Unless your whole business is transferring files, don’t. And even then… Don’t.
And if you’re still confused, the answer is to use another company’s infrastructure for this. Use Azure. Use AWS. Use Google Cloud or even G Suite. Don’t accept that liability. Let the trillionaires do it.
I mean if you put up an Internet-facing unauthenticated file acceptor it will quickly become stuffed with all sorts of garbage and aspiring malware. You definitely don’t want to hook that up to an untar and exec loop, even with some notion of sandboxing. It will just start mining Bitcoins or sending spam or something.
But if it is built properly, with only authorized users being able to upload stuff, and a basic understanding of not dropping stuff where the web server will happily execute every PHP web shell someone sticks in the slot, and the leverage to threaten people into not uploading pictures of their own or others’ butts or Iron Man (2009), I don’t see why all but the file-uploading professionals should immediately give up.
You can accept them on internal networks, just have a file size limit and don’t extract them locally, but send to some cloud service for handling. You could even have it work with email attachments if you want.
Basically:
- Put file somewhere
- Spin up runner
- Upload and execute code
- Spin down runner either upon success or after a time limit
- Send result to the student (if it took too long, that’s a fail too)
My first method eliminates waiting to see if your students’ code runs fast enough. Unless complexity is part of the assignment, I’d still say go for the hash.
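The steps above can be sketched like this in Python. Big caveat: in this sketch a temporary directory and a local subprocess stand in for the real runner, which gives you a time limit but no actual isolation. In practice the "runner" would be a throwaway VM or container on someone else's infrastructure. The function name and result format are made up for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_submission(source: str, time_limit: float = 5.0) -> dict:
    """Put the file somewhere, run it with a time limit, then clean up."""
    # "Spin up runner": here just a scratch directory (assumption, see above).
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "submission.py"
        script.write_text(source)  # "Upload" the code to the runner.
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=time_limit,
            )
        except subprocess.TimeoutExpired:
            # Taking too long counts as a failure as well.
            return {"status": "fail", "reason": f"exceeded {time_limit}s time limit"}
    # Leaving the `with` block removes the directory ("spin down").
    status = "pass" if proc.returncode == 0 else "fail"
    reason = f"exit code {proc.returncode}"
    if proc.stderr.strip():
        reason += ": " + proc.stderr.strip().splitlines()[-1]
    return {"status": status, "reason": reason}
```

The returned dictionary is what you would then send back to the student by whatever channel you use.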
It’s also less work for the professor/grader.
You could use automated testing tools to do the work for you. You define your requirements as individual tests and every input is tested separately giving you a report which tests failed and which succeeded.
If you use Moodle, it has a plugin for that, with instructions.
If you don’t use Moodle, you may want to check the instructions on the plugin anyway.
Did a take-home for a company recently that did it well. They required that I make a Dockerfile (you could give them one if you wanted) that, when run, would run the tests. It was a neat use of Docker IMO: it standardized things so that building was just “build the Dockerfile” and running was just “run the Dockerfile”. You wouldn’t have to deal with tar or anything then.
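On the grading side, that Docker convention could be driven by something like the Python sketch below. It assumes the `docker` CLI is installed and that the submission's Dockerfile runs its tests as the container's default command. The function names, tag, and resource limits are illustrative choices of mine.

```python
import subprocess


def build_cmd(tag: str, context_dir: str) -> list:
    """Command to build the submission's Dockerfile into an image."""
    return ["docker", "build", "-t", tag, context_dir]


def run_cmd(tag: str) -> list:
    """Command to run the image with no network and capped resources."""
    return [
        "docker", "run", "--rm",
        "--network", "none",  # no network access from inside the container
        "--memory", "256m",   # illustrative memory cap
        "--cpus", "1",        # illustrative CPU cap
        tag,
    ]


def grade_with_docker(tag: str, context_dir: str, timeout: int = 300):
    """Build, then run; a zero exit code from the container means tests passed."""
    build = subprocess.run(
        build_cmd(tag, context_dir), capture_output=True, text=True, timeout=timeout
    )
    if build.returncode != 0:
        return False, build.stderr
    # Note: this timeout stops the docker client; a stuck container may
    # still need to be cleaned up separately.
    run = subprocess.run(
        run_cmd(tag), capture_output=True, text=True, timeout=timeout
    )
    return run.returncode == 0, run.stdout + run.stderr
```

Keep in mind that building a student-supplied Dockerfile already executes their `RUN` steps, so the build itself belongs on a disposable machine too.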
Thousand ways to skin a cat there
That sounds like build automation. You can use some Git forge software.
My university had a system like this. They also had all the tests and the expected answers in files at …/tests.txt and …/results.txt. You could read both files, look at the command line args passed to your code, find which line in the tests file they had passed, and return that same line number from the results file. 100% on every single item. They pulled me into a meeting to complain about it but granted me the marks anyway, because I was technically correct according to the marking criteria. Needless to say, they fixed the access perms and rewrote the marking criteria for the next year.
Yeah, this is a bad idea if not done extremely carefully, because of RCE. There is CodeRunner as a Moodle plugin. If you want to roll your own, you would want to use something like Firecracker for secure execution.
Full unix mode is probably easier than working up some kind of sandboxing mechanism that accepts arbitrary scripts/binaries.
As far as nice to the eye, you can spin up a Python FastAPI site and frontend in about 10 minutes with Claude Code.





