Uncertainty due to the model itself: the model doesn't represent the data completely because we lack information (we only ever have a finite set of data).
Uncertainty inherent in the data itself. I think of it as measurement error in sensors.
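These two kinds of uncertainty are often called epistemic (model) and aleatoric (data) uncertainty. A minimal sketch of the difference, under made-up assumptions: the true relation is y = 2x, the sensor noise has standard deviation 0.5, and the model is a straight-line fit with `np.polyfit`. The names `NOISE_STD`, `sample`, and `slope_spread_and_residual_std` are hypothetical helpers for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
NOISE_STD = 0.5  # assumed sensor noise level (uncertainty in the data)

def sample(n):
    """Draw n noisy measurements of the assumed relation y = 2x."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 2.0 * x + rng.normal(0.0, NOISE_STD, size=n)
    return x, y

def slope_spread_and_residual_std(n, n_fits=200):
    """Refit a line on n_fits fresh datasets of size n.

    Returns (spread of the fitted slopes, average residual std).
    The first is a proxy for model uncertainty, the second for data noise.
    """
    slopes, residual_stds = [], []
    for _ in range(n_fits):
        x, y = sample(n)
        slope, intercept = np.polyfit(x, y, deg=1)
        slopes.append(slope)
        residual_stds.append(np.std(y - (slope * x + intercept)))
    return np.std(slopes), np.mean(residual_stds)

small = slope_spread_and_residual_std(20)
large = slope_spread_and_residual_std(2000)
print("n=20:   slope spread, residual std =", small)
print("n=2000: slope spread, residual std =", large)
```

With more data the spread of the fitted slopes should shrink, because the model is pinned down better, while the residual standard deviation should stay near the assumed noise level, because collecting more samples does not remove measurement error.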
The noise is not independent of the input features. For example, in the Hadron Collider, the data have more noise in some regions (possibly when the particles are faster) and less noise in others. Put more simply, the noise depends on X.
The noise is independent of X.
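The two cases above are usually called heteroscedastic (noise depends on X) and homoscedastic (noise independent of X). A small sketch of how they differ in simulated data, assuming Gaussian noise and an arbitrary, made-up rule in which the noise standard deviation grows linearly with x. The helper `std_by_half` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=10_000)

# Noise independent of X: the same standard deviation everywhere.
noise_const = rng.normal(0.0, 0.3, size=x.size)

# Noise dependent on X: standard deviation grows with x (assumed form).
noise_dep = rng.normal(0.0, 0.05 + 0.5 * x)

def std_by_half(noise):
    """Return the noise std for x < 0.5 and for x >= 0.5."""
    low = noise[x < 0.5].std()
    high = noise[x >= 0.5].std()
    return low, high

low_c, high_c = std_by_half(noise_const)
low_d, high_d = std_by_half(noise_dep)
print("independent of X:", low_c, high_c)
print("dependent on X:  ", low_d, high_d)
```

In the first case both halves of the input range should show roughly the same spread. In the second case the upper half should be clearly noisier, which is the pattern described in the collider example.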