We consider models of superconductivity in which pairing originates in a gain in kinetic, rather than potential, energy of the carriers. In such systems, a change in the frequency-dependent conductivity occurs at frequencies much higher than the scale set by the superconducting energy gap. This property follows from a general sum-rule argument. To clarify the physical origin of this spectral weight transfer, we consider several microscopic Hamiltonians that give rise to an effective Hamiltonian with a kinetic pairing interaction. These models describe small polarons with a non-linear interaction with a background degree of freedom, which gives rise to an effective mass enhancement that depends on the local charge occupation. Superconductivity in these models can be understood as arising from the partial ``undressing'' of carriers that occurs upon pairing. The same ``undressing'' occurs in these systems upon doping in the normal state, which causes superconductivity to disappear at high doping. The optical conductivity is calculated for one of the model Hamiltonians to illustrate the effect. We suggest that some of the observed phenomenology of high-$T_c$ oxide superconductors resembles the behavior of the class of superconductors discussed here.