A group project with Merih Atasoy and Ita Zap. We trained a neural network to help detect photo manipulation, using ResNet-18 for convolutional feature extraction followed by our own fully-connected layers for binary classification: photoshopped or original.
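As a rough illustration of this architecture, the sketch below trains a small fully-connected head for one step on stand-in feature vectors. It is not the project's code. The hidden size, learning rate, weight decay, batch size, and the label convention (0 = original, 1 = photoshopped) are all assumptions made for the example, and random tensors take the place of real ResNet-18 features. The 512-dimensional input assumes the pooled ResNet-18 output is used. The choice of cross-entropy loss and SGD follows the training setup described in this write-up.

```python
import torch
from torch import nn

torch.manual_seed(0)

FEATURE_DIM = 512  # size of ResNet-18's pooled feature vector
HIDDEN_DIM = 128   # assumed; the write-up does not give layer sizes

# Hypothetical classification head: two fully-connected layers ending in
# two logits (assumed convention: 0 = original, 1 = photoshopped).
head = nn.Sequential(
    nn.Linear(FEATURE_DIM, HIDDEN_DIM),
    nn.ReLU(),
    nn.Linear(HIDDEN_DIM, 2),
)

criterion = nn.CrossEntropyLoss()
# Placeholder learning rate and weight decay, not the project's settings.
optimizer = torch.optim.SGD(head.parameters(), lr=0.01, weight_decay=1e-4)

# Stand-in batch: random vectors in place of real ResNet-18 features.
features = torch.randn(8, FEATURE_DIM)
labels = torch.tensor([0, 1, 0, 1, 0, 1, 0, 1])

# One optimisation step on the head.
optimizer.zero_grad()
logits = head(features)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()

predictions = logits.argmax(dim=1)  # predicted class per image
print(tuple(logits.shape), loss.item())
```

Only the head is updated in this sketch, which corresponds to using ResNet-18 purely as a fixed feature extractor. Whether the project also fine-tuned the ResNet weights is not stated here.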
The data comes from the public Reddit Photoshop Battle dataset, collected from a contest in which Reddit users post manipulated images. Members of the University of Basel summarized the dataset statistics in a report and provided a GitHub repository for downloading the labelled photos. The dataset covers diverse photoshop techniques, such as splicing (combining parts of two different images) and copy-move (rearranging parts of the same image).
The entire 40 GB dataset comprises approximately 11,000 original and over 100,000 photoshopped images. Photo heights range from 136 to 20,000 pixels, as shown in the histogram in Appendix A. The dataset was downloaded using a script published on GitHub; due to resource constraints, only about half of it was obtained.
The data is well labelled: each original image is named with a unique code, and the name of each photoshopped version includes the original image's code and a derivative number. A comprehensive diagram of the software structure is shown in the figure below. The structure can be summarized as follows:
Image Preprocessing: Each original image is concatenated with one manipulated derivative. A set of 2000 images is split 60/20/20 into training, validation, and test sets.
ResNet Features: Each concatenated image is split back into its original and derivative halves, the ResNet features of each half are loaded, and the resulting feature tensors are concatenated.
DataLoader: Three DataLoader functions are created to load unfiltered, high-pass-filtered, or low-pass-filtered images.
Train Function: In each batch, the tensor of ResNet features is split into its original and derivative parts. The loss function is cross-entropy loss and the optimizer is SGD.
Hyperparameter Grid Search: Iterates over different filter options, batch sizes, weight decay values, learning rates, and learning rate decay values.
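The 60/20/20 split and the grid search in the steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the names `split_indices`, `grid`, and `SEARCH_SPACE`, the random seed, and every value in the search space are placeholders invented for the example. Only the split ratio, the 2000-image count, and the list of searched hyperparameters come from the description.

```python
import itertools
import random


def split_indices(n_items, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle item indices and cut them into train/validation/test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * fractions[0])
    n_val = round(n_items * fractions[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder, so no item is dropped
    return train, val, test


# Hypothetical search space: the keys mirror the hyperparameters listed
# above, but every value is a placeholder, not a setting from the project.
SEARCH_SPACE = {
    "image_filter": ["none", "high_pass", "low_pass"],
    "batch_size": [32, 64],
    "weight_decay": [0.0, 1e-4],
    "learning_rate": [1e-2, 1e-3],
    "lr_decay": [1.0, 0.9],
}


def grid(search_space):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(search_space)
    for values in itertools.product(*(search_space[k] for k in keys)):
        yield dict(zip(keys, values))


train_idx, val_idx, test_idx = split_indices(2000)
configs = list(grid(SEARCH_SPACE))
print(len(train_idx), len(val_idx), len(test_idx))
print(len(configs))
```

In a full search, each configuration from `grid` would be used to train one model, and the configuration with the best validation accuracy would be kept, so that the test set is only touched once at the end.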
The chosen model's training, validation, and test accuracies were 91%, 74%, and 70%, respectively. The test accuracy being close to the validation accuracy suggests that the model has learned useful features rather than only the idiosyncrasies of the training set. It also beat the 50% accuracy of the base model. The different model types are compared in Table 1 of the linked report.
For our full report, see the documentation on GitHub. The project can also be found on the web via the project link.