Maybe some dirt inside the data? Encoding, quoting or structural problems?
Not exactly sure about the encoding, but they're CSVs that were exported normally.
Opening the files (and playing with the data) in Excel doesn't show any issue.
The one that's 112K rows has 39 columns, the one with 1.5M rows has 26 columns.
A suggestion here to "move lines around in the file" got me to try sorting the data differently (descending instead of ascending). That may have fixed it, but it'll take some new data to truly confirm.
[UPDATE: Nope. Sorting changes which records are included in the extract, but it's still an incomplete set. Sorting by date did let me see that the records are loaded chronologically, and where they stop. There's nothing odd about that specific row that would make me think it is causing the issue.]
I suppose this suggests that it's an issue on the data side of things...
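Since Excel can quietly tolerate things a stricter loader won't, one way to check the files for encoding, quoting, or structural problems is a quick script. This is only a minimal sketch, assuming Python is available and the files are meant to be UTF-8. The function names and the file name `extract.csv` are made up for illustration, and `expected_cols` would be 39 or 26 depending on the file.

```python
import csv

def find_bad_rows(lines, expected_cols, max_report=20):
    """Return (line_number, field_count) for rows whose field count
    differs from expected_cols. `lines` is any iterable of text lines,
    e.g. a file opened with newline=''."""
    bad = []
    reader = csv.reader(lines)
    for row in reader:
        if len(row) != expected_cols:
            # line_num is the physical line where this row ended, so a
            # stray quote that swallows several lines will show up here.
            bad.append((reader.line_num, len(row)))
            if len(bad) >= max_report:
                break
    return bad

def find_undecodable_lines(byte_lines, encoding="utf-8"):
    """Return 1-based numbers of lines that fail to decode with the
    given encoding. `byte_lines` is any iterable of bytes lines,
    e.g. a file opened in 'rb' mode."""
    bad = []
    for number, line in enumerate(byte_lines, start=1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError:
            bad.append(number)
    return bad

# Hypothetical usage (file name and column count are assumptions):
#
# with open("extract.csv", "rb") as f:
#     print(find_undecodable_lines(f))
# with open("extract.csv", newline="", encoding="utf-8", errors="replace") as f:
#     print(find_bad_rows(f, expected_cols=39))
```

If either function reports a line near where the loaded records stop, that row (or the one just before it) would be the first place to look. An empty result from both wouldn't prove the data is clean, but it would make a simple quoting or encoding problem less likely.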