From the outside, keeping existing features alongside new ones *sounds* like a good idea. But dig a little deeper and some problems start jumping out.
Continuously retaining old legacy code leads to bulky, overweight programs that take up more disc space and require more work to diagnose and update. When problems do occur, every old/new option doubles the number of potential paths to check while tracking them down.
New features would be expected to work with old data, so not only do they need code for dealing with new data, they must somehow work around the old data as well. Or at least gracefully inform the user "I can't do that, Dave", although even that may be seen as ungraceful by some users.
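To put a rough number on that doubling, here is a minimal sketch that enumerates every old/new combination for a handful of options. The option names are made up purely for illustration, and it assumes the options are independent of each other, which is the worst case.

```python
from itertools import product

def configurations(options):
    """Return every combination of old/new choices for the given option names."""
    return [dict(zip(options, combo))
            for combo in product(("old", "new"), repeat=len(options))]

# Hypothetical option names, purely for illustration.
options = ["renderer", "file_format", "undo_stack"]
paths = configurations(options)
print(len(paths))  # 2 ** 3 = 8 combinations to consider
```

Three retained options already mean eight configurations a bug could be hiding in; ten would mean 1024.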
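As a rough sketch of what "gracefully inform the user" might look like, the snippet below has a new feature check a version field before touching the data. The version numbers, the `UnsupportedData` exception, and the document layout are all assumptions for the sake of the example, not how any particular program does it.

```python
CURRENT_VERSION = 2  # hypothetical version number of the "new" data format

class UnsupportedData(Exception):
    """Raised when a feature cannot work with the data it was given."""

def new_feature(document):
    # Documents without a version field are assumed to be the old format.
    version = document.get("version", 1)
    if version < CURRENT_VERSION:
        # Refuse politely rather than crash or silently produce garbage.
        raise UnsupportedData(
            f"This document uses format v{version}; "
            f"convert it to v{CURRENT_VERSION} to use this feature."
        )
    return sorted(document["items"])

def run(document):
    """Call the feature and turn a refusal into a user-facing message."""
    try:
        return new_feature(document)
    except UnsupportedData as err:
        return f"Sorry, I can't do that: {err}"
```

Calling `run({"version": 2, "items": [3, 1, 2]})` returns the sorted list, while `run({"items": [1]})` returns the apology string. Even this small check is extra code the new feature has to carry only because old data still exists.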
The oldest features are based on tools and frameworks which are being deprecated in the newest operating systems. This means that those old features may simply not be feasible with new releases.
That's not to say all of these apply everywhere, all the time. But there are reasons for a RISC (reduced instruction set computing) style of development.