Source code similarity detector v1.0

@mikkomaran mikkomaran released this 09 Apr 18:17

Description

JavaFX application for detecting similarity between Python source code files, using Levenshtein distance as the metric. The results are presented as clusters and pairs of similar files.
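
To illustrate the metric, here is a minimal sketch of Levenshtein distance turned into a similarity score. The class name, the method names and the normalisation (1 minus the distance divided by the longer length) are assumptions made for this example; the application's own implementation may differ.

```java
// Hypothetical sketch: names and normalisation are not taken from the application.
final class LevenshteinSketch {

    // Edit distance computed with dynamic programming, keeping only two rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalisation: 1.0 for identical texts, 0.0 for entirely different ones.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0; // two empty texts are treated as identical
        }
        return 1.0 - (double) distance(a, b) / longest;
    }
}
```

For example, LevenshteinSketch.distance("kitten", "sitting") is 3, which under this normalisation gives a similarity of about 0.57.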

Requirements

  • Java JDK version 11+
  • Tested & working on Windows OS

Running the application

Download the Source.code.similarity.detector.jar file and run the executable JAR. If double-clicking the JAR doesn't start the program, try the following commands from the command line:

  • javaw -jar "path/to/file/Source code similarity detector.jar"
  • javaw -cp "path/to/file/Source code similarity detector.jar" ee.ut.similaritydetector.ui.App

Input files

The application takes as input a ZIP file generated from Moodle submissions. The ZIP file contains one folder per student, named with the student's ID code and name separated by an underscore (e.g. "MM_Mikko Maran"). Each student folder contains that student's submission files for every exercise (e.g. "exercise1.py", "exercise2.py", ...).
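
The layout described above can be sketched as follows: given the entry paths inside the ZIP, group student ID codes by exercise file name. The class and method names are hypothetical, and the assumption that everything before the first underscore is the ID code comes only from the "MM_Mikko Maran" example; the application's own parsing may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names and parsing rules are assumptions, not the application's code.
final class SubmissionLayoutSketch {

    // "MM_Mikko Maran" -> "MM": the text before the first underscore is taken as the ID code.
    static String studentId(String folderName) {
        int sep = folderName.indexOf('_');
        return sep < 0 ? folderName : folderName.substring(0, sep);
    }

    // Maps each exercise file name (e.g. "exercise1.py") to the ID codes of the
    // students who submitted it, given paths such as "MM_Mikko Maran/exercise1.py".
    static Map<String, List<String>> studentsByExercise(List<String> entryPaths) {
        Map<String, List<String>> result = new TreeMap<>();
        for (String path : entryPaths) {
            int slash = path.indexOf('/');
            if (slash < 0 || path.endsWith("/")) {
                continue; // skip top-level files and directory entries
            }
            String exercise = path.substring(path.lastIndexOf('/') + 1);
            if (!exercise.endsWith(".py")) {
                continue; // only Python submissions are of interest
            }
            String id = studentId(path.substring(0, slash));
            result.computeIfAbsent(exercise, k -> new ArrayList<>()).add(id);
        }
        return result;
    }
}
```

The entry paths could be collected with java.util.zip.ZipFile by calling getName() on each entry. Grouping by exercise first means only solutions to the same exercise need to be compared with each other.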

Features

  • Custom similarity threshold - the user can set the similarity percentage at or above which two solutions are considered suspiciously similar
  • Preprocessing source code files - all comments and empty lines are removed from the source code before the analysis starts
  • Anonymous results - the results are presented by student ID codes rather than names
  • Code review window - allows reviewing the source codes of suspicious solutions with syntax highlighting
  • Analysis statistics
  • Light & dark theme
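
As an illustration of the preprocessing and threshold features, the following is a simplified sketch that strips "#" comments and empty lines from Python source and checks a similarity score against a percentage threshold. All names are hypothetical, the "at or above" comparison is an assumption, and the sketch works line by line, so it does not treat multi-line (triple-quoted) strings or docstrings specially; the application's own preprocessing may be more thorough.

```java
// Hypothetical sketch: a simplified, line-based take on the preprocessing step.
final class PreprocessSketch {

    // Removes '#' comments and lines that are empty once comments are removed.
    static String stripCommentsAndBlankLines(String source) {
        StringBuilder out = new StringBuilder();
        for (String line : source.split("\\R")) {
            String code = stripTrailingComment(line).stripTrailing();
            if (!code.isBlank()) {
                out.append(code).append('\n');
            }
        }
        return out.toString();
    }

    // Cuts the line at the first '#' that is not inside a quoted string on that line.
    static String stripTrailingComment(String line) {
        char quote = 0; // 0 means "not inside a string literal"
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (quote != 0) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == quote) {
                    quote = 0; // closing quote
                }
            } else if (c == '"' || c == '\'') {
                quote = c; // opening quote
            } else if (c == '#') {
                return line.substring(0, i);
            }
        }
        return line;
    }

    // Assumed rule: a pair is flagged when its similarity (0.0 to 1.0) reaches the threshold percentage.
    static boolean isSuspicious(double similarity, double thresholdPercent) {
        return similarity * 100.0 >= thresholdPercent;
    }
}
```

Stripping comments and blank lines before comparison keeps cosmetic edits from lowering the similarity score of otherwise identical solutions.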