We will first showcase two cutting-edge applications of modern Artificial Intelligence (AI), one in biomedical imaging and one in reasoning. These applications, like most modern AI applications (e.g., ChatGPT, AlphaFold, AlphaGo, Google Translate, self-driving cars), are based on deep learning, a modern rebranding of neural networks, or connectionist methods, dating back to the 1980s, or even the 1950s. We will then briefly review the tortuous, neuroscience-inspired historical path that has led to deep learning and the key discoveries made along the way, highlighting the synergies and discrepancies between neuroscience and deep learning. One key conclusion is that approximate gradient descent is essential for learning. However, backpropagation, the standard gradient descent algorithm of deep learning, is not biologically plausible for multiple reasons. We will examine these reasons one by one and identify a biologically plausible solution for each of them. In particular, we will introduce and demonstrate a general class of neural architectures and learning algorithms capable of learning from data in a largely unsupervised and asynchronous manner, without the need for symmetric connectivity.
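To make the symmetric-connectivity issue concrete, the following is a minimal NumPy sketch (not taken from the text, and not the specific architecture introduced here) contrasting backpropagation, which transports the error backward through the transpose of the forward weights, with a feedback-alignment-style alternative that uses a fixed random feedback matrix instead. All names and hyperparameters (`W1`, `W2`, `B`, `lr`) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: learn a fixed random linear map with a
# two-layer network  x -> h = tanh(W1 x) -> y = W2 h.
n_in, n_hid, n_out = 4, 16, 2
W_true = rng.normal(size=(n_out, n_in))

W1 = rng.normal(scale=0.5, size=(n_hid, n_in))
W2 = rng.normal(scale=0.5, size=(n_out, n_hid))
B = rng.normal(scale=0.5, size=(n_hid, n_out))  # fixed random feedback weights

lr = 0.02
losses = []
for t in range(3000):
    x = rng.normal(size=n_in)
    target = W_true @ x
    h = np.tanh(W1 @ x)
    y = W2 @ h
    e = y - target  # output error
    # Backpropagation would compute the hidden error as (W2.T @ e),
    # requiring feedback connectivity symmetric to the forward weights.
    # Feedback alignment replaces W2.T with the fixed random matrix B:
    dh = (B @ e) * (1.0 - h**2)
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(dh, x)
    losses.append(0.5 * float(e @ e))

print(np.mean(losses[:200]), np.mean(losses[-200:]))
```

Empirically, the forward weights tend to align themselves with the random feedback pathway during training, so the loss still decreases without any weight symmetry; swapping `B` for `W2.T` in the hidden-error line recovers standard backpropagation.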