Someone can find all these issues in 200 lines of code :) See sibling comment.

nuc1e0n · on Sept 15, 2023

This isn't my code, but it gives you an idea of the level of complexity involved. But don't reimplement what you don't need to.

https://github.com/mafintosh/csv-parser/blob/master/index.js

x86x87 · on Sept 16, 2023

To be clear my comment was meant as a joke.

Looking at the parser I see a few problems with it just by skimming the code. I'm not saying it wouldn't work or that it's not good enough for certain purposes.

nuc1e0n · on Sept 16, 2023

Oh yeah? Such as? What purposes do you think it wouldn't be good for? The author will probably be interested in your feedback. Apparently it's getting over a million downloads a week on npm.

x86x87 · on Sept 16, 2023

I have not used it, so this is mostly speculation, but I would be curious about character set handling, mixed line ending handling, large file handling, and the handling of invalid and almost-valid files.
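To make a couple of those corner cases concrete, here is a minimal sketch of a quote-aware parser that also accepts mixed line endings. `parseCsv` is a hypothetical helper written for illustration, not code from csv-parser. It assumes the whole input is already in memory as a string, and it does not report unterminated quotes or handle other malformed input.

```javascript
// Hypothetical helper, not taken from csv-parser: parses a whole CSV string
// into an array of rows, each row being an array of string fields.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;
  let i = 0;

  const endField = () => { row.push(field); field = ''; };
  const endRow = () => { endField(); rows.push(row); row = []; };

  while (i < text.length) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // a doubled quote is an escaped literal quote
          i += 2;
          continue;
        }
        inQuotes = false; // closing quote
      } else {
        field += ch; // commas and line breaks are literal inside quotes
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      endField();
    } else if (ch === '\r' || ch === '\n') {
      if (ch === '\r' && text[i + 1] === '\n') i += 1; // treat \r\n as one break
      endRow();
    } else {
      field += ch;
    }
    i += 1;
  }

  // Flush the last row when the input does not end with a line break.
  if (field !== '' || row.length > 0) endRow();
  return rows;
}

// One input mixing \r\n, \n and bare \r, plus quoted commas and escaped quotes.
console.log(parseCsv('a,b\r\n"x,1","say ""hi"""\n3,4\r5,6'));
```

Even this small version leaves open questions (blank lines, a trailing empty quoted field, a quote appearing mid-field), which is roughly the kind of thing the linked issues are about.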

You can pick up on some of the corner-case issues here: https://github.com/mafintosh/csv-parser/issues Also look at the ones that were solved: https://github.com/mafintosh/csv-parser/pulls

Some interesting ones: https://github.com/mafintosh/csv-parser/pull/121 https://github.com/mafintosh/csv-parser/pull/151 https://github.com/mafintosh/csv-parser/issues/218

The author of the library has probably learned many, many lessons the hard way (and probably also had to prioritize some of the reported issues and feature requests along the way).

The above is not meant as a ding on the project itself and I am sure it is used successfully by many people. The point here is that your claim that you can easily write a CSV parser in 200 lines of code does not hold water. It's anything but easy and you should use a battle-tested library and not reinvent the wheel.

nuc1e0n · on Sept 16, 2023

If you had read my original comment, you would see I didn't claim it's easy to do, only that it can be done in around 200 lines. That's clearly the case.

Character set handling isn't really an issue for JavaScript, as strings are always UTF-16. When a file is read into a string, the runtime handles the needed conversion.
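A small sketch of that conversion step, assuming Node.js and its built-in `Buffer`: the result is always a UTF-16 string, but the runtime decodes the bytes according to whichever encoding it is told to use, so the same bytes can come out differently. The byte values are just an illustrative example.

```javascript
// The word "café" encoded as Latin-1 (ISO-8859-1), where é is the single byte 0xE9.
const bytes = Buffer.from([0x63, 0x61, 0x66, 0xe9]);

// Decoded with the encoding the bytes were actually written in.
const asLatin1 = bytes.toString('latin1');

// Decoded as UTF-8: a lone 0xE9 is not valid UTF-8, so the decoder
// substitutes U+FFFD (the replacement character) for it.
const asUtf8 = bytes.toString('utf8');

console.log(asLatin1);
console.log(asUtf8.includes('\uFFFD'));
```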

As for handling large files, I've used this with 50 MB CSVs, which would need a 32-bit integer to index. Is that large enough? It's not like Windows Notepad, which can only read 64 KB files.
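For files too big to hold in one string, the usual approach is to process the input in chunks and carry any partial line over to the next chunk. A minimal sketch, assuming the chunks arrive as strings: `linesFromChunks` is a hypothetical helper for illustration, and it splits on raw line breaks only, so it would still need the quote handling of a real parser.

```javascript
// Hypothetical helper: yields complete lines from an iterable of string chunks,
// keeping any unfinished trailing line until the next chunk arrives.
function* linesFromChunks(chunks) {
  let pending = '';
  for (const chunk of chunks) {
    pending += chunk;
    // Hold back a trailing \r: the matching \n may be in the next chunk.
    const heldCr = pending.endsWith('\r');
    const usable = heldCr ? pending.slice(0, -1) : pending;
    const parts = usable.split(/\r\n|\r|\n/);
    // The last piece has no line break after it yet, so it stays pending.
    pending = parts.pop() + (heldCr ? '\r' : '');
    yield* parts;
  }
  // Emit whatever is left once the input ends without a final line break.
  if (pending !== '') {
    yield pending.endsWith('\r') ? pending.slice(0, -1) : pending;
  }
}

// A \r\n pair split across two chunks is still treated as a single break.
console.log([...linesFromChunks(['a,b\r', '\nc,d\n', 'e,f'])]);
```

With a real file, the chunks would come from something like a read stream rather than an array, and the loop would become asynchronous.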

efreak · on Sept 17, 2023

Windows Notepad can read multi-megabyte files. It can read files that are hundreds of megabytes. It's not pleasant, loading is incredibly slow, and resizing the window when reflow is enabled makes it take that much longer, but it's definitely possible.

x86x87 · on Sept 17, 2023

My point was that it's not trivial and it's hard to get it right. The way I read your comment was that it's not hard and can easily be done in 200 lines. It's possible I misread it.

I think the original point I was making still stands.