The whole premise of taking NAND from 25nm to 19nm (for instance) is to fit more floating gates in the same area. You can take that as a smaller die, or as more bits on a slightly larger die than the previous generation.
Die size is indeed a major factor in cost. For a given technology (litho node + process, i.e. the number and type of steps), the cost to process a wafer is fairly constant regardless of die size.
If you shrink the die, you fit more die on a wafer. Additionally, yield goes up (for a given, die-size-independent manufacturing defect density), and especially for large die, the tessellation losses around the wafer edge have a major impact.
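Both effects are easy to put numbers on. Here's a rough sketch using the standard gross-die-per-wafer approximation and a simple Poisson yield model; the wafer size, defect density, and die areas are hypothetical, not taken from the parts being discussed:

```python
import math

def gross_die_per_wafer(wafer_diam_mm, die_area_mm2):
    # Standard approximation: usable wafer area divided by die area,
    # minus an edge-loss term for partial die around the rim.
    r = wafer_diam_mm / 2
    return math.floor(math.pi * r**2 / die_area_mm2
                      - math.pi * wafer_diam_mm / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2, defects_per_mm2):
    # Simple Poisson model: probability a die lands zero defects.
    return math.exp(-die_area_mm2 * defects_per_mm2)

# Hypothetical numbers: 300mm wafer, defect density 0.1/cm^2 (= 0.001/mm^2)
for area in (50, 100, 200):
    dpw = gross_die_per_wafer(300, area)
    y = poisson_yield(area, 0.001)
    print(f"{area} mm^2: {dpw} gross die, yield {y:.1%}, ~{dpw * y:.0f} good die")
```

Note how halving the die area more than doubles the good die count: you win on die count, edge tessellation, and yield simultaneously.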
Indirectly, even testing is related to die size: there is a limit to tester parallelism, and more gates mean more time and more combinatorial patterns to test, e.g. for stuck-at testing.
There are of course non-linear costs in packaging and package-level testing and elsewhere.
The per-unit costs of a chip[1] are essentially constant per wafer. The smaller a chip is, the more of them each wafer will produce, and the smaller the chance that any given chip will be ruined by an imperfection. So the number of chips you get from each dollar of production cost scales somewhat better than inversely with the area of the chip. There are other factors too, though, and wafers from more advanced nodes tend to be more expensive. However, the two examples being compared here were both from the 25nm node.
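To see the "better than inversely proportional" part: good die per dollar is (gross die × yield) / wafer cost, and yield itself improves as the die shrinks. A minimal sketch, ignoring edge losses and assuming a made-up $3000 wafer cost and defect density:

```python
import math

def good_die_per_dollar(die_area_mm2, wafer_cost_usd,
                        wafer_area_mm2=70_000, defects_per_mm2=0.001):
    # Gross die (no edge correction) times Poisson yield, per dollar.
    gross = wafer_area_mm2 / die_area_mm2
    return gross * math.exp(-die_area_mm2 * defects_per_mm2) / wafer_cost_usd

small = good_die_per_dollar(80, 3000)   # hypothetical 80 mm^2 die
large = good_die_per_dollar(160, 3000)  # same design at twice the area
print(small / large)  # exceeds 2: the yield term compounds the area win
```

Under this model the ratio is exactly 2 × e^(0.08) ≈ 2.17, i.e. halving the area buys slightly more than a 2x cost advantage.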
[1] Which dominate with memory since production runs tend to be large and the regular patterns make for less design investment than, say, a CPU.
It's always been related, but it's not the only factor. Packaging, assembly, and test are the other primary components of manufacturing cost. You are correct, though, that price (as opposed to cost) is going to include amortized R&D, marketing, license fees, margin, etc.