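I think it is worth looking at the Dropout paper, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting", around the section "6.5 Comparison with Standard Regularizers".
The authors hypothesize that co-adaptations, in which neurons come to depend on other neurons, lead to overfitting, and they write that Dropout was devised to prevent this.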
In a standard neural network, the derivative received by each parameter tells it how it should change so the final loss function is reduced, given what all other units are doing.
Therefore, units may change in a way that they fix up the mistakes of the other units.
This may lead to complex co-adaptations.
This in turn leads to overfitting because these co-adaptations do not generalize to unseen data.
The part that describes how Dropout counteracts this:
We hypothesize that for each hidden unit, dropout prevents co-adaptation by making the presence of other hidden units unreliable.
Therefore, a hidden unit cannot rely on other specific units to correct its mistakes.
We found that as a side-effect of doing dropout, the activations of the hidden units become sparse, even when no sparsity inducing regularizers are present. Thus, dropout automatically leads to sparse representations.
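To make "making the presence of other hidden units unreliable" concrete, here is a minimal NumPy sketch of a dropout layer. The function name `dropout_forward` and the parameter `p_retain` are my own choices for illustration, not names from the paper. As an assumption for simplicity, the test-time branch scales the activations by `p_retain`, which stands in for the paper's approach of scaling the outgoing weights at test time.

```python
import numpy as np

def dropout_forward(h, p_retain, rng, train=True):
    """Apply dropout to hidden activations h (hypothetical helper)."""
    if train:
        # Each unit is kept independently with probability p_retain,
        # so no unit can count on a specific other unit being present.
        mask = rng.random(h.shape) < p_retain
        return h * mask
    # At test time all units are present; scale by p_retain so the
    # expected activation matches what was seen during training.
    return h * p_retain

rng = np.random.default_rng(0)
h = np.ones((4, 5))  # toy activations: 4 examples, 5 hidden units
out_train = dropout_forward(h, 0.5, rng, train=True)
out_test = dropout_forward(h, 0.5, rng, train=False)
```

A different random mask is drawn on every training call, so a hidden unit effectively sees a different set of neighbours for each example. Many libraries instead use "inverted" dropout, which divides by the retain probability during training and leaves test-time activations untouched; the sketch above does not do that.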