It’s Time to Change from Mechanical Thinking to Biological Thinking — the Secret Behind Deep Learning

Haohan Wang
7 min read · May 2, 2017

In order to apply the principles that underpin the miracle of the human neural system, we first need to think differently about models and algorithms. Typically, when we analyze the regular machine learning (ML) and modeling process, it is not hard to see that we are using the ‘mechanical’ way of thinking: we set up the goals, analyze the problems, collect the information, and construct and adhere to plans to generate the output. By doing so, we stress efficiency and short-term performance. We have to admit that this is a splendidly practical and effective method for dealing with relatively simple challenges in relatively stable environments, and it is the most common way we solve problems every day:

Even for a classic machine learning algorithm, the structure is pretty similar, except that we allow the algorithm to map from hand-coded features here:

In fact, it was a good enough mental model for business and regular modeling overall, until the environment and situations gradually became far more dynamic and unpredictable.

How can we come up with a new mental model to handle these more complex situations that we now increasingly face?

The thinking started to change when we came to the representation learning stage, in which we no longer rely on hand-designed representations but instead allow the model to ‘learn’ the representation by itself. Deep learning (DL), especially, can learn representations and express them in terms of other, simpler representations. I believe the outstanding performance of representation learning and deep learning is starting to lead us to reconsider their fundamental differences from other ML models, and it also draws our attention to another type of thinking: biological thinking. The benefits of biological thinking could be illustrated as below:

Biological systems are distinct from many physical systems in that they have a history. Living things evolve over time. While the objects of physics clearly do not emerge from thin air — astrophysicists even talk about the evolution of stars — biological systems are especially subject to evolutionary pressures; in fact, that is one of their defining features. The complicated structures of biology have the forms they do because of these complex historical paths, ones that have been affected by numerous factors over huge amounts of time.

The Need for Biological Thinking to Solve Complex Problems

Actually, some of the earliest DL algorithms we recognize today were intended to be computational models of biological learning. I believe DL keeps proving and reminding us of the importance of finding the right framework, which will eventually outperform the others in the long term.

Let’s put all these graphs together to see how the process evolved:

Now, we need to master the art of biological thinking. In other words, we need to think more modestly and subtly about when and how we can shape rather than control dynamic and complex situations.

If deep learning is so powerful, why is it not yet commonplace?

In order to answer this question, we first need to go through the history of DL and its trends so far.

In fact, DL dates back to the 1940s: it was known as cybernetics from the 1940s to the 1960s, as connectionism from the 1980s to the 1990s, and its current resurgence under the name deep learning began in 2006.

As I mentioned above, some of the earliest DL models were actually computational models of biological learning, that is, models of how learning happens or could happen in the brain. As a result, one of the names of deep learning is artificial neural networks (ANNs). Here I put three keywords together in Google Trends: deep learning, artificial neural networks, and TensorFlow (developed by Google to meet its need for systems capable of building and training neural networks to detect and decipher patterns and correlations, analogous to the learning and reasoning that humans use; it was announced in November 2015). We can see that the dramatic upward trend for all three terms started around 2015.

It is worth mentioning that deep learning models were inspired by the biological brain, while the kinds of neural networks used for ML are not designed to be realistic models of biological function. In other words, the brain provides proof that intelligent behavior is possible, and a conceptually straightforward path to building intelligence is to reverse engineer the computational principles behind the brain and duplicate its functionality. The modern term ‘deep learning’ goes beyond this neuroscientific perspective and can be applied to machine learning frameworks that are not necessarily neurally inspired.

There is also a field dedicated to understanding how the brain works on an algorithmic level. This endeavor is primarily known as ‘computational neuroscience’, and it is a field separate from deep learning.

Neuroscience has given us a reason to believe that a single deep learning algorithm can solve many different tasks. Scientists have found that much of the mammalian brain might use a single algorithm to solve most of the different tasks that the brain solves. As a result, it is common for DL research groups today to study many or even all of these application areas (natural language processing, vision, motion planning and speech recognition) simultaneously. The basic idea of having many computational units that become intelligent only through their interactions with each other is inspired by the brain. The neocognitron introduced a powerful model architecture for processing images that was inspired by the structure of the mammalian visual system and later became the basis for the modern convolutional network.

The earliest predecessors of modern DL were simple linear models motivated by a neuroscientific perspective. Basically, these models were designed to take a set of inputs x1, x2, …, xn and associate them with an output y. They would learn a set of weights w1, w2, …, wn and compute their output as f(x, w) = x1w1 + x2w2 + … + xnwn. This first wave of neural network research was known as cybernetics.
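To make this concrete, here is a minimal sketch (my own, not from the original sources) of such a linear model trained with a simple error-correction rule in the spirit of that era; the toy data, target function, and learning rate are illustrative assumptions:

```python
import numpy as np

def f(x, w):
    """The early linear model: a weighted sum of the inputs."""
    return np.dot(x, w)

# Toy task (made up for illustration): learn y = 2*x1 - 3*x2.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([2.0, -3.0, -1.0, 1.0])

w = np.zeros(2)   # weights w1, w2 start at zero
lr = 0.1          # learning rate (illustrative choice)

for epoch in range(100):
    for x_i, y_i in zip(X, y):
        error = y_i - f(x_i, w)   # how far the prediction is from the target
        w += lr * error * x_i     # nudge each weight in proportion to its input

print(w)   # approaches roughly [2.0, -3.0]
```

The point is only that the entire “learning” consists of adjusting a handful of weights in a fixed linear form; there is no learned representation yet, which is exactly what the later waves added.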

In the 1980s, the second wave of neural network research emerged, in great part via a movement called connectionism (or parallel distributed processing) that arose in the context of cognitive science. Cognitive science is an interdisciplinary approach to understanding the mind, combining multiple different levels of analysis.

Several key concepts arose during the connectionism movement of the 1980s that remain central to today’s DL:

  1. Distributed representation: this is the idea that each input to a system should be represented by many features, and each feature should be involved in the representation of many possible inputs.
  2. Back-propagation: the algorithm, popularized during this period, used to train deep neural networks with internal representations (a minimal sketch follows after this list).
http://www.andreykurenkov.com/writing/a-brief-history-of-neural-nets-and-deep-learning/
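To illustrate both ideas together, below is a minimal sketch (again my own, not from the sources above) of a tiny network whose hidden layer forms a distributed representation and whose weights are trained with back-propagation. The XOR task, the architecture, and the hyperparameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: a classic task that no single linear unit can solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# A tiny network: 2 inputs -> 4 hidden units -> 1 output.
W1 = rng.normal(scale=1.0, size=(2, 4))   # input-to-hidden weights
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=1.0, size=(4, 1))   # hidden-to-output weights
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(10000):
    # Forward pass: the hidden layer is the learned distributed representation.
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the output error back through the layers.
    d_out = p - y                          # gradient at the output pre-activation
    d_hid = (d_out @ W2.T) * h * (1 - h)   # gradient pushed back to the hidden layer

    # Gradient-descent weight updates.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_hid
    b1 -= lr * d_hid.sum(axis=0, keepdims=True)

print(p.round(2))   # typically close to [[0], [1], [1], [0]] after training
```

Each input ends up encoded across several hidden units at once, and each hidden unit participates in representing several inputs, which is precisely the distributed-representation idea from the list above.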

Based on the Google Trends graph that I showed before, one may wonder why deep learning has only recently become recognized as an important technology. The answer is simple: today we can provide these algorithms with the resources they need to succeed. The age of ‘Big Data’ has made ML much easier because it eases the key barrier of statistical estimation, namely generalizing well to new data after observing only a small amount of data. As a rule of thumb, a DL algorithm will generally achieve acceptable performance with around 5,000 labeled examples per category, and it will match or even exceed human performance when trained with a dataset containing at least 10 million labeled examples. We can also take advantage of large quantities of unlabeled examples with unsupervised or semi-supervised learning.

Another key reason is that we have the computational resources to run much larger models today. In terms of the total number of neurons, neural networks have been quite small until recently. A key insight of connectionism is that an individual neuron, or a small collection of neurons, is not particularly useful on its own; neural network size has increased over time, driven by faster computers with larger memory and by the availability of larger datasets. Larger networks are able to achieve higher accuracy on more complicated tasks.

Summary:

  • DL has had a long and rich history under a few different names, and has waxed and waned in popularity.
  • DL is becoming more and more useful as the amount of available training data and computational resources has increased.
  • DL models have solved increasingly complicated applications with increasing accuracy over time.
  • DL is an approach to ML that has drawn heavily on our knowledge of the human brain, statistics, and applied math.
  • In recent years, DL has seen tremendous growth in its popularity and usefulness, largely as the result of more powerful computers, larger datasets, and techniques to train deeper networks.

Reference:

http://www.neuraldump.com/2016/03/introduction-to-neural-networks/neuron_diagram/

https://www.farnamstreetblog.com/2016/09/biological-thinking/

Deep Learning (Adaptive Computation and Machine Learning series) by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

https://www.ted.com/talks/martin_reeves_how_to_build_a_business_that_lasts_100_years#t-780376

http://www.andreykurenkov.com/writing/a-brief-history-of-neural-nets-and-deep-learning/
