Just got home finally from PHP Appalachia. I enjoyed meeting
all the great people.
I presented about what I learned and how we deal with importing
large amounts of CSV data into MySQL. I threw my idea onto
the wiki at the last minute, made the slides while everyone ate
breakfast and I had planned on researching it all (been a few years
since I wrote it), but we had no reliable internet. Some
claims I made and their corrections.
I said our largest file is about 1.8 million lines.
WRONG. Actually it is about 4.6 million. I was
correct however that it does finish importing and indexing in
about 5 minutes.
I claimed I LOAD DATA INFILE to MyISAM first and then
"insert into ... select from" into an InnoDB table for speed
reasons. WRONG. In fact, I do that because I need
to merge fields from the file sometimes into one field in the
databaes. I could not find a way to do that with LOAD
DATA INFILE. As to speed. I can't say either way as
I have no solid data. Sounds like a good test.
MyISAM probably still wins on a LOAD DATA INFILE into a blank,
fresh table based on my experience.
Total rows currently indexed is 7.2 million. I did
not make a claim, but I thought I would just mention
that. I wanted to include that, but did not have
Internet. (Damn you Hughes)