Sam Stites

Data augmentation is a hack (albeit a necessary one)

Ideally, models are invariant to translation, rotation, and scaling. You would hope that your model identifies you in close-up photos as well as in group shots (even if one is just a photoshopped version of the other).

That said, convolutional layers are only translation-invariant, and only somewhat translation-invariant at that! This is where data augmentation comes in: we train our model on transformed copies of the data so that it becomes robust to translation, rotation, and scaling (hopefully resulting in invariance). In the end it's really just a hack that we all live with. Everything works well enough with the hack in place, though approaches like Capsule Networks attempt to chip away at it.
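
To make that concrete, here is a minimal sketch of what such an augmentation pipeline might look like using torchvision's transforms. The specific transforms, parameters, and paths are illustrative assumptions, not a recipe:

```python
from torchvision import transforms

# Randomly perturb each image at training time so the model sees
# translated, rotated, and rescaled variants of the same content.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),  # random scale + crop (translation)
    transforms.RandomRotation(degrees=15),                # small random rotations
    transforms.RandomHorizontalFlip(),                    # mirror half the images
    transforms.ToTensor(),
])

# Applied on the fly as a dataset transform, e.g.
# (the "photos/" path is just a placeholder):
# dataset = torchvision.datasets.ImageFolder("photos/", transform=augment)
```

Because the transforms are re-sampled every epoch, the model effectively never sees the exact same image twice, which is what nudges it toward the invariances we wish it had by construction.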

Questions arise: