I suspect there may be a method by which one could convert solved sudokus into ...

SomeStupidPoint · on June 12, 2017

My immediate thought was to generate training data by making up solved puzzles, and then removing numbers one at a time (while ensuring there's still a unique solution), with a few different branches at each step from an initial solution.
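A rough Python sketch of that removal procedure, for illustration only. The function names (`make_solved_grid`, `count_solutions`, `remove_clues`) are made up here, and the details are assumptions: the solved grid comes from shuffling a fixed base pattern, uniqueness is checked with a plain backtracking counter that stops at two solutions, and "branching" is approximated by rerunning the removal with a different random order.

```python
import random

def make_solved_grid(rng):
    """Build a valid solved grid by shuffling a known-good base pattern."""
    def shuffled(seq):
        return rng.sample(list(seq), len(seq))
    # Permuting bands, rows within a band (same for columns) and digit labels
    # keeps the grid valid.
    rows = [3 * band + r for band in shuffled(range(3)) for r in shuffled(range(3))]
    cols = [3 * stack + c for stack in shuffled(range(3)) for c in shuffled(range(3))]
    digits = shuffled(range(1, 10))
    return [[digits[(3 * (r % 3) + r // 3 + c) % 9] for c in cols] for r in rows]

def candidates(grid, r, c):
    """Digits not yet used in the row, column, or 3x3 box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]

def count_solutions(grid, limit=2):
    """Count solutions by backtracking, stopping once `limit` is reached."""
    best = None
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                opts = candidates(grid, r, c)
                if best is None or len(opts) < len(best[2]):
                    best = (r, c, opts)  # prefer the most constrained empty cell
    if best is None:
        return 1  # no empty cells left: this is one complete solution
    r, c, opts = best
    total = 0
    for d in opts:
        grid[r][c] = d
        total += count_solutions(grid, limit - total)
        grid[r][c] = 0
        if total >= limit:
            break
    return total

def remove_clues(solution, target_clues, rng):
    """Blank cells in random order, keeping only removals that stay unique."""
    puzzle = [row[:] for row in solution]
    cells = [(r, c) for r in range(9) for c in range(9)]
    rng.shuffle(cells)
    clues = 81
    for r, c in cells:
        if clues <= target_clues:
            break
        saved = puzzle[r][c]
        puzzle[r][c] = 0
        if count_solutions(puzzle) == 1:
            clues -= 1
        else:
            puzzle[r][c] = saved  # removal created ambiguity, so put it back
    return puzzle
```

Calling `remove_clues(solution, 30, random.Random(seed))` with several seeds gives several puzzles from one solution. The loop can stop above the target if no remaining clue can be removed without losing uniqueness, and every intermediate grid along the way is itself a valid (easier) puzzle for the same solution.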

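You could generate massive datasets that way -- it would be pretty easy to generate a few billion pre-images of a solution. I mean, a solution has 81 numbers and a puzzle about 20. 81 choose 20 is about 4.7 × 10^18, or a few billion billion. That's an upper bound, since only some of those 20-number subsets still have a unique solution, but it leaves plenty of room.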
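A quick check of that count, as a sketch. A 9x9 grid has 81 cells, and the figure of 20 clues is just the rough number from the comment, not a fixed property of puzzles. The result counts ways to pick which cells stay filled; it says nothing about how many of those picks leave a unique solution.

```python
import math

CELLS = 81  # cells in a 9x9 solved grid
CLUES = 20  # rough clue count assumed in the comment; real puzzles vary

# Number of ways to choose which 20 of the 81 cells keep their number.
subsets = math.comb(CELLS, CLUES)
print(f"{subsets:.2e} candidate clue patterns per solution")

# For comparison, the count with 80 cells instead of 81.
print(f"{math.comb(80, CLUES):.2e} with 80 cells")
```

Either way the number is on the order of 10^18, so sampling a few billion patterns per solution barely touches the space.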