What is Big Data, and why you could care

January 8, 2011

That’s the title for my talk this coming Thursday¬† night (January 13th, 2011), at the first “Tech Talk” sponsored by Sierra Commons. Details are at http://sierracommons.org/2011/local-business/2108.

Erika Kosina has done a great job of setting this up, and I’m looking forward to meeting more of the local tech community.

Some of the things I’ll be touching on are:

  • Why big data is the cool new kid on the block, or why I wished I really knew statistics.
  • Crunching big data to filter, cluster, classify and recommend.
  • How big data + machine learning == more effective advertising, for better or worse.
  • The basics of Hadoop, an open source data processing system.

Hope to see you there!



Big Data talk for technical non-programmers

December 22, 2010

I’ve going to be giving a talk on big data for the newly formed Nevada County Tech Talk event – a monthly gathering at Sierra Commons.

Unfortunately most of the relevant content I’ve got is for Java programmers interested in using Hadoop. Things I could talk about, based on personal experience:

  • A 600M page web crawl using Bixo.
  • Using LibSVM to predict medications from problems.
  • Using Mahout’s kmeans clustering algorithm on pages referenced from tweets (the unfinished Fokii service).

I’m looking for relevant talks that I can borrow from, but I haven’t found much that’s targeted at the technically minded-but-not-a-programmer crowd.

Comments with pointers to useful talks/presentations would be great!