Some old math videos
This set of videos was created using Theano (only to optimize the computations) for a short talk on PDEs on curved spaces.
I just want to explore a simple and meaningless question: Does BERT encode the length of a sentence in the norm of its [CLS] embedding?
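One way to make the question concrete is to check whether sentence length and the norm of the [CLS] vector are correlated. The sketch below is only an illustration under assumptions: it presumes the [CLS] embeddings have already been extracted as plain lists of floats, and the names `l2_norm`, `pearson` and `length_norm_correlation` are hypothetical helpers, not part of any library. Measuring length in tokens is also an assumption.

```python
import math


def l2_norm(vector):
    """Euclidean norm of a sequence of floats."""
    return math.sqrt(sum(x * x for x in vector))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def length_norm_correlation(token_counts, cls_embeddings):
    """Correlate sentence lengths with the norms of their [CLS] vectors.

    token_counts:   one length per sentence (assumed: number of tokens)
    cls_embeddings: one precomputed [CLS] vector per sentence (assumed input)
    """
    norms = [l2_norm(embedding) for embedding in cls_embeddings]
    return pearson(token_counts, norms)
```

A value close to 1 or -1 would hint at a linear relationship between length and norm, while a value near 0 would only rule out a linear one; the norm could still depend on length in some non-linear way.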