Establishing linkages between ML and differential geometry is intriguing (to say the least). But I have this nagging sense that "data manifolds" are too rigidly tied to numerical representations for this program to flourish. Differential geometry is all about invariance. Geometric objects have a life of their own, so to speak, irrespective of any particular representation. In the broader data science world, such an internal structure is generally not accessible. The systems being modeled are too complex, and their capture in data too superficial, for the data to reflect the "true state". In a sense this is analogous to the blind men touching different parts of an elephant and disagreeing about what it is.
I'm not sure I agree that data manifolds are too rigid. When we look at the quality of score-based generative models and diffusion models, we can see clear evidence of how flexible these representations are. One could raise the same objection about statistical manifolds, yet the fact that the Fisher information is the fundamental metric tensor of the statistical manifold is a key ingredient of many first- and second-order optimizers today.
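To make the Fisher point a bit more concrete, here is a minimal sketch (my own toy example, not taken from any particular optimizer) of a natural-gradient step for a Bernoulli model. It assumes the standard closed forms for the Fisher information in the mean parameter p and in the logit parameter theta; all function names are hypothetical. The idea is that preconditioning the gradient with the inverse Fisher makes a small step land on (approximately) the same distribution regardless of the parametrization, which is the kind of invariance the parent comment asks for.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fisher_p(p):
    # Fisher information of Bernoulli(p) in the mean parameter p.
    return 1.0 / (p * (1.0 - p))

def fisher_theta(theta):
    # Same model in the logit parameter theta = log(p / (1 - p)).
    p = sigmoid(theta)
    return p * (1.0 - p)

def nll_grad_p(p, mean_x):
    # Gradient of the average negative log-likelihood with respect to p,
    # where mean_x is the empirical mean of the 0/1 observations.
    return -(mean_x / p) + (1.0 - mean_x) / (1.0 - p)

def nll_grad_theta(theta, mean_x):
    # Same gradient, taken with respect to theta.
    return sigmoid(theta) - mean_x

def natural_step_p(p, mean_x, lr):
    # Natural gradient: ordinary gradient divided by the Fisher information.
    return p - lr * nll_grad_p(p, mean_x) / fisher_p(p)

def natural_step_theta(theta, mean_x, lr):
    return theta - lr * nll_grad_theta(theta, mean_x) / fisher_theta(theta)

if __name__ == "__main__":
    p0, mean_x, lr = 0.2, 0.7, 1e-3
    theta0 = math.log(p0 / (1.0 - p0))

    # Natural-gradient steps in the two coordinate systems, compared as
    # probabilities; they agree up to second-order terms in lr.
    print(natural_step_p(p0, mean_x, lr))
    print(sigmoid(natural_step_theta(theta0, mean_x, lr)))

    # Plain gradient steps, for contrast; these depend on the coordinates.
    print(p0 - lr * nll_grad_p(p0, mean_x))
    print(sigmoid(theta0 - lr * nll_grad_theta(theta0, mean_x)))
```

In the p coordinate the natural-gradient update simplifies to p - lr * (p - mean_x), so the Fisher preconditioning exactly cancels the 1 / (p(1 - p)) blow-up near the boundary. The agreement between the two parametrizations is only first order in the step size, so this is an illustration of the invariance rather than a proof of it.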