Tuesday, April 29, 2008

After my blog post last month, I thought to myself, "Girl! It is SO time to move on. Enough with the 2.0 stuff." With that, it was on to Semantic Web concepts. YouTube has been the most helpful, as some of the W3C "tutorials" are, well, a little dry. (Not all of them, just some of them.)

Here is a list of some of the videos I really enjoyed:
Tim Berners Lee on the Semantic Web
An Introduction to the Semantic Web
RDFa Basics
Google Tech Talks: Semantic Web

After my head stopped hurting from the first 20 minutes of the Google Tech Talk, I found this article which further explains RDFa.


In his video on the Semantic Web (which is his concept), Tim Berners-Lee describes what I perceive as being the core of Semantic Web technology. The Semantic Web is a method of knowledge representation coupled with the collective intelligence that seems to be reaching some level of maturity in web applications.

That may seem like a fairly small statement, but there is really a lot going on that makes this possible.

1. Advances in hardware have taken us from the Pentium III to quad-core processors in just a few years. Processing larger amounts of data with the complex structure of collective intelligence is no longer limited by hardware.
2. Cloud computing allows more developers to focus on collective intelligence problems. There's really no need to spend a day futzing around with server/network crap anymore, unless it's just something that turns you on (and that's ok.)
3. There are enough CS students who see the value in going on to grad studies due to a globalized market for developers.
4. Web technology is finally past the hump of, "gee, look at my cool home page with the spinning

All of these factors are combining to take internet applications past the document.

I could go on, but this blog is starting to get lengthy.

For this blog, I've started using Zemanta. It's a Firefox plugin that makes suggestions for whatever it is that you are writing about. It shows pictures and links to articles, and it also suggests tags for the post. I think the content is from Creative Commons. That's how they get around copyright. The picture I've posted is from Zemanta, but the links are ones that I pulled from previous wanderings.

I really like the feature that links phrases in the post to Wikipedia.



Friday, April 4, 2008

Previously, I wrote about one of my favorite web 2.0 apps, Pandora.

After I wrote that post, I read through a chapter in my new favorite book, Programming Collective Intelligence by Toby Segaran. The chapter was about searching and ranking using a neural net. It greatly reminded me of Pandora. The user inputs the name of an artist or a song. Each artist or song is an entry in Pandora's Music Genome Project, which associates it with different musical characteristics. Pandora will initially suggest music based on these characteristics. That seems like straight-up database to me...bfd. What's interesting is that once the music starts playing, the user has the option to give each song a "thumbs-up" or "thumbs-down." That's training! For most of the semester, I assumed that Pandora must be using a neural network for this training. Then today, I started reading the chapter in Programming Collective Intelligence on filtering documents. There are other ways to train that are not as computationally expensive. This really leaves me to wonder. Aside from pondering the ways in which Pandora is suggesting tunes, I'm very curious about how one would unit test and system test these types of things.
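As a rough sketch of what a cheaper kind of training could look like, here is a tiny naive Bayes classifier that learns from thumbs-up and thumbs-down votes. This is only an illustration and not how Pandora actually works (that isn't public in this post). The class name, the trait strings, and the "up"/"down" labels are all made up for the example.

```python
from collections import defaultdict


class ThumbsClassifier:
    """Toy naive Bayes over song traits. Hypothetical; not Pandora's method."""

    LABELS = ("up", "down")

    def __init__(self):
        # trait -> label -> number of voted songs with that trait
        self.trait_counts = defaultdict(lambda: defaultdict(int))
        # label -> number of votes seen
        self.label_counts = defaultdict(int)

    def train(self, traits, label):
        """Record one thumbs-up or thumbs-down vote for a song's traits."""
        self.label_counts[label] += 1
        for trait in set(traits):
            self.trait_counts[trait][label] += 1

    def score(self, traits, label):
        """Smoothed prior times the smoothed likelihood of each trait."""
        total = sum(self.label_counts.values())
        # Add-one smoothing keeps unseen labels and traits from zeroing the product.
        prob = (self.label_counts[label] + 1) / (total + len(self.LABELS))
        for trait in set(traits):
            seen = self.trait_counts[trait][label]
            prob *= (seen + 1) / (self.label_counts[label] + 2)
        return prob

    def classify(self, traits):
        """Return whichever label scores higher for these traits."""
        return max(self.LABELS, key=lambda label: self.score(traits, label))


# Made-up listening history.
listener = ThumbsClassifier()
listener.train(["acoustic", "female vocals"], "up")
listener.train(["acoustic", "folk"], "up")
listener.train(["distorted guitar", "screaming"], "down")
listener.train(["distorted guitar", "fast tempo"], "down")

guess_folk = listener.classify(["acoustic", "folk"])
guess_loud = listener.classify(["distorted guitar"])
```

Training here is just counting, which is why it is so much cheaper than adjusting the weights of a neural network. It also hints at one way to unit test this kind of thing: feed in a small, fixed set of votes and assert on the predicted label.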

It's been so easy to stick with Web 2.0 concepts this semester. The information I've found about the Semantic Web has been fairly sparse. From what I can tell, it's all about filtering from an extremely wide net. For example, a few days ago, I found out about Zemanta. It's a semantic application that suggests content for blogs. While the user is writing a post, Zemanta suggests content to go with it. The technologies I've looked at for the Semantic Web seem to revolve around markup and data representation.
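To make "data representation" a little more concrete: RDF, the data model behind much of the Semantic Web, describes things as subject-predicate-object statements called triples. Here is a minimal sketch of that idea in plain Python. The predicate names and the match helper are invented for illustration; real RDF uses URIs and dedicated libraries.

```python
# Each fact is a (subject, predicate, object) triple.
# The predicate names are hypothetical, chosen only for this example.
triples = [
    ("Zemanta", "is_a", "semantic application"),
    ("Zemanta", "suggests", "blog content"),
    ("Pandora", "is_a", "web 2.0 app"),
    ("Pandora", "suggests", "music"),
]


def match(facts, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in facts
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


about_zemanta = match(triples, subject="Zemanta")
suggesters = [s for (s, p, o) in match(triples, predicate="suggests")]
```

Because every fact has the same three-part shape, one generic query function can answer questions across all of them, which is roughly the appeal of representing knowledge this way.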

T