Thursday, November 6, 2008

Readings for Week 10

Web Search Engines

I thought this article was interesting. It is amazing to me how quickly a system can filter through so much information and supply results in a few seconds. The only problem is how relevant those results are to the search or inquiry. The information about indexing algorithms and so on went a bit over my head, but I believe I understood the basic information the article was trying to supply.

OAI Protocol for Harvesting Metadata

This article discussed trends and developments in harvesting metadata. The OAI world is divided into data providers and service providers. It also discussed the challenges service providers face and ideas for cutting down on those problems. I liked reading about the different projects universities around the country are doing, especially the four that are digitizing sheet music. Overall I thought this article was pretty interesting.

The Deep Web

This article discusses the problem of finding relevant information amid the wealth of information available. Traditional search engines cannot retrieve information in the deep web, but according to the author, the value of the deep web is immeasurable. I was curious what exactly is considered "the deep web." Looking at the chart, I was surprised by a few of the sites that are considered deep web, such as eBay and Amazon. The article was interesting, but the more I read, the more I felt that "the deep web" was a mystical place, or something like a black hole. Maybe there will be a cheesy sci-fi thriller entitled "The Deep Web," or maybe that is Captain Ahab's next adventure.

Muddiest Point

With markup, do you have to use a code to indicate color, like in HTML, or do you just write it? I was also curious whether you need a background in HTML before you use markup.

2 comments:

Petunia said...

I'm no expert, but I think the idea of the deep web is this: there are surface web pages, where all of the information is placed directly at those access points. This is the information that search engines like Google can readily access. However, some organizations put some information on these "surface pages" but store the rest of their information in databases. Information *about* the database is linked to the web, but not the actual information contained *within* the database. So the highway is the surface web, and the deep web is the exits that lead to the back roads. Traditional search engines can only travel the highway and see the signs for the exits; they cannot actually get off at one of the exits and travel the back roads.

rjz said...

Yes, deep web crawling is fascinating. How exactly eBay is considered part of the deep web is what I'd like to know. Wouldn't you?