
28GB is really too small for Hadoop to get out of bed for, in general. Though I would wonder why it was _that_ slow with Spark (or Hadoop, for that matter).


Spark is going to try to ingest all the data, and it won’t fit in RAM. Wrong tool for the job, basically.


Depends on exactly how you do it, I suppose, but Spark shouldn't necessarily need to hold everything in memory. Most Hadoop-y work can also be accomplished in Spark without much fuss.
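
For what it's worth, here's a minimal PySpark sketch of a classic Hadoop-style word count (the input/output paths are hypothetical, just to show the shape). Spark evaluates RDDs lazily and processes them a partition at a time, spilling shuffle data to disk, so the full dataset never needs to sit in RAM at once:

    # Minimal word-count sketch; paths are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("wordcount").getOrCreate()

    counts = (
        spark.sparkContext.textFile("data/input.txt")   # read lazily, partition by partition
        .flatMap(lambda line: line.split())             # split each line into words
        .map(lambda word: (word, 1))                    # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)                # sum counts per word (shuffle spills to disk)
    )

    counts.saveAsTextFile("out/wordcounts")             # hypothetical output directory
    spark.stop()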


Because it was running on a laptop instead of on a cluster :-)



