What is Big Data, and why you could care

January 8, 2011

That’s the title for my talk this coming Thursday night (January 13th, 2011), at the first “Tech Talk” sponsored by Sierra Commons. Details are at http://sierracommons.org/2011/local-business/2108.

Erika Kosina has done a great job of setting this up, and I’m looking forward to meeting more of the local tech community.

Some of the things I’ll be touching on are:

  • Why big data is the cool new kid on the block, or why I wish I really knew statistics.
  • Crunching big data to filter, cluster, classify and recommend.
  • How big data + machine learning == more effective advertising, for better or worse.
  • The basics of Hadoop, an open source data processing system.

Hope to see you there!

 


Big Data talk for technical non-programmers

December 22, 2010

I’m going to be giving a talk on big data for the newly formed Nevada County Tech Talk event, a monthly gathering at Sierra Commons.

Unfortunately, most of the relevant content I’ve got is for Java programmers interested in using Hadoop. Things I could talk about, based on personal experience:

  • A 600M page web crawl using Bixo.
  • Using LibSVM to predict medications from problems.
  • Using Mahout’s kmeans clustering algorithm on pages referenced from tweets (the unfinished Fokii service).

I’m looking for relevant talks that I can borrow from, but I haven’t found much that’s targeted at the technically-minded-but-not-a-programmer crowd.

Comments with pointers to useful talks/presentations would be great!