With hindsight, one can say that the initial successes were also due to the simplicity of the sources mapped. It is now clear that one should not apply this method to an extended source which covers several times the resolution limit (the width of the central peak of the dirty beam). Such a source could have a broad, gentle maximum in the dirty map, and subtracting a narrow dirty beam at this point would generate images of the sidelobes with the opposite sign. These would create new maxima where new CLEAN components would be placed by the algorithm, and the process can become unstable. One precaution which certainly helps is the ``gain factor'' (actually a loss factor, since it is less than one). After finding a maximum, one does not subtract the full value but only a fraction of it, typically 0.2 or less (the basic loop is sketched in code below). In simple cases this would just make the algorithm slower without changing the solution, but it genuinely helps when sources are more complex. One is being conservative in not fully believing the sources found initially, which gives the algorithm a chance to change its mind and look for sources elsewhere. If this sounds like a description of animal behaviour, the impression being conveyed is correct.

Our understanding of CLEAN is largely a series of empirical observations and rules of thumb, with common-sense rationalisations after the fact, but no real mathematical theory. One exception is the work of Schwarz (1978, A\&A, 65, 345), which interpreted each CLEAN subtraction as a least-squares fit of the current dirty map to a single point source (written out schematically below). This is interesting but not enough: CLEAN carries out these subtractions sequentially, and with a gain factor. In principle, each value of the gain factor could lead to a different solution, i.e. a different collection of CLEAN components, in the realistic case when the number of data points is less than the number of resolution elements in the map. So what are we to make of the practical successes of CLEAN? Simply that in those cases the patch of sky being imaged had a large enough empty portion that the number of CLEAN components really needed was smaller than the number of data points available in the $(u,v)$ plane. Under such conditions, one can believe that the solution is unique.

Current implementations of CLEAN allow the user to define ``windows'' in the map, so that one does not look for CLEAN components outside them. But when a large portion of the field of view has some nonzero brightness, there are indeed problems with CLEAN. The maps show spurious stripes whose separation corresponds to unmeasured spatial frequencies (that is how one deduces they are spurious). One should think of this as a wrong choice of invisible distribution which CLEAN has made. Various modifications of CLEAN have been devised to cope with this, but the fairest conclusion is that the algorithm was never meant for extended structure. Given that it began with isolated point sources, it has done remarkably well in other circumstances.
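To make the iteration described above concrete, here is a minimal sketch, in Python with NumPy, of the basic CLEAN loop with a gain factor. It assumes a dirty map and a dirty beam supplied as two-dimensional arrays on the same grid, with the beam peak at the central pixel and normalised to one; the names and the wrap-around shift of the beam are simplifications of my own, not taken from any standard package.

\begin{verbatim}
import numpy as np

def clean(dirty_map, dirty_beam, gain=0.2, n_iter=500, threshold=0.0):
    """Simple CLEAN loop: returns (components, residual)."""
    residual = dirty_map.copy()
    components = np.zeros_like(dirty_map)
    # Pixel at which the dirty beam has its (unit) central peak.
    by, bx = dirty_beam.shape[0] // 2, dirty_beam.shape[1] // 2

    for _ in range(n_iter):
        # Locate the current maximum of the residual (dirty) map.
        iy, ix = np.unravel_index(np.argmax(residual), residual.shape)
        peak = residual[iy, ix]
        if peak <= threshold:
            break  # nothing significant left to subtract

        # Subtract only a fraction 'gain' of the peak, times the dirty
        # beam centred on that pixel.  np.roll wraps at the edges; a
        # careful implementation would use a beam computed over twice
        # the map size instead.
        flux = gain * peak
        shifted_beam = np.roll(np.roll(dirty_beam, iy - by, axis=0),
                               ix - bx, axis=1)
        residual -= flux * shifted_beam

        # Record the CLEAN component found at this position.
        components[iy, ix] += flux

    return components, residual
\end{verbatim}

The only free choices are the gain, the number of iterations and the stopping threshold; as remarked above, different gains can in principle lead to different collections of CLEAN components.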
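For completeness, here is a schematic one-dimensional version of the least-squares statement attributed above to Schwarz; the notation is mine, not taken from this text. Suppose visibilities $V_k$ are measured at spatial frequencies $u_k$ with weights $w_k$, and the dirty map is $D(x)=\sum_k w_k V_k e^{2\pi i u_k x}/\sum_k w_k$. Fitting a single point source of strength $a$ at a trial position $x_0$ means minimising
\[
\chi^2(a,x_0) \;=\; \sum_k w_k \left| V_k - a\,e^{-2\pi i u_k x_0} \right|^2 .
\]
For a fixed $x_0$, the best-fitting strength is $a=\mathrm{Re}\,D(x_0)$, and the resulting decrease in $\chi^2$ is $a^2\sum_k w_k$, which is greatest when $x_0$ is placed at the peak of the (residual) dirty map. In this sense each CLEAN subtraction is a least-squares fit of one point source to the current residuals.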