Saturday, January 28, 2017

Introduction to Search Engines

So you were assigned to create a PowerPoint on the effectiveness of logos for companies. You hit up a search engine, and of course you’re going to find your answer in no time! You type in “logos” and you get hundreds of hits from companies promising you the best prices on creating your own company logo. Wait, what? No!!! That’s not what I need! You look at the classmate sitting next to you, and within ten minutes she has already printed tons of information and is basically ready to go. Panic sets in and you come to terms with the idea that you will fail this marketing class and never graduate from college! Maybe your boss at your part-time job at the Tiki Post will consider you for a full-time position. Eeeek!






Yes, if you don’t know how to navigate the web in 2017 you might feel a bit, perhaps, inadequate? The question is why? Fear no more! There is no shame in not knowing the basics, and you know what? I guarantee you that you’re not alone! Another fact you might find comforting is that many web-savvy individuals might also benefit from this, your guide to Searching the Web for Dummies! And you’re no dummy at all! Soon you will be an expert and will no longer have to feign disinterest when people talk about new things out there, because you will be well versed in the subject matter. So sit back, relax, pay attention, and get ready to learn!


DISCLAIMER: I too, before taking this tutorial, was not 100% sure of my skills. This is a breakdown of what I learned in a simplified (I HOPE!) version. However, some of the information, because it is practically impossible to simplify, is as it appears on the website. 



Search Engines: a Definition

First Things First…What are search engines?


Of course, the first thing you must understand, even if you think you already understand it, is the definition of a search engine. A search engine is a compilation of databases of web page files that are automatically put together by a machine. But what exactly does this mean?

That somebody else is already doing half the work for you. 

These databases hold the information you are looking for and have been assembled in such a way that when you search for something, they look for key words and are able to provide you with what they believe to be the closest option to what you’re searching for. 

Huh? 

Key words are indexed and filed so that they are easily recognized in each search. These engines try to stay up to date so that the results you get are the most recent. If you look closely after you search, your results are divided into tabs. One includes everything that was found, another holds images, and another holds videos. This lets you zero in on the kind of result you need.
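
If you are curious what "indexed and filed" can look like, here is a minimal sketch in Python. It is not how any real search engine works internally; the page names and page text are made up for illustration, and a page only matches when it contains every keyword in the search.

```python
from collections import defaultdict

# Hypothetical mini "web": these page names and texts are made up.
PAGES = {
    "page1": "company logos and brand recognition",
    "page2": "design your own company logo cheap",
    "page3": "how logos affect marketing and brand loyalty",
}

def build_index(pages):
    """Map each keyword to the set of pages that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def search(index, query):
    """Return the pages that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

INDEX = build_index(PAGES)
print(search(INDEX, "logos"))
print(search(INDEX, "logos marketing"))
```

Notice that the page about designing a "logo" does not match a search for "logos" in this sketch, because it only compares exact words. That is one small reason why the way you word a search matters.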

Easy, huh?


Undoubtedly, search engines are great because they, well, search a large portion of the web, and they’re the best method that exists today to search it. But they also have their disadvantages. Bummer! Because of the vast amount of information out there, you have to be extremely meticulous about how you key in your searches, because these engines look for keywords and you can get hundreds of hits. That’s when we run into the problem of trying to determine which hit is the best one and, well, that can be rather time consuming.



Another thing that you must know is that there are two different types of search engines: individual search engines, which compile their own searchable databases of the web (explanation will follow), and metasearchers, which do not compile databases but instead search the databases of multiple individual engines simultaneously (will follow).

Metasearchers: a Definition

Note: If as you read through these sections, you feel a bit overwhelmed, don't worry. That is perfectly normal. If all this information doesn't make sense to you yet, it will as you continue reading. We promise!

Okay, here it is: 

As the prefix “meta” suggests, a metasearcher goes beyond what an ordinary search engine does, and it goes “beyond” by searching multiple individual search engines simultaneously. Metasearch engines let you know which engines are retrieving the best results for your search.


Huh? I know! Here’s a breakdown:

These engines give you the results in either a single list or multiple lists. Most give you the single list which, as its name suggests, is just one list with no duplicate entries. The multiple-list format displays the results in separate lists, one per search engine.
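
To picture the single list, here is a rough Python sketch of combining results from several engines into one list with no duplicates. The engine names and addresses are made up, and real metasearchers also rank and score results, which this sketch leaves out.

```python
def merge_results(results_by_engine):
    """Combine per-engine result lists into one list with no duplicates.

    Each address is kept once, in the order it was first seen, along with
    the engines that returned it.
    """
    merged = {}  # address -> engines that found it (dicts keep insertion order)
    for engine, urls in results_by_engine.items():
        for url in urls:
            engines = merged.setdefault(url, [])
            if engine not in engines:
                engines.append(engine)
    return merged

# Hypothetical results: these engine names and addresses are made up.
RESULTS = {
    "EngineA": ["example.com/logos", "example.org/branding"],
    "EngineB": ["example.org/branding", "example.net/marketing"],
}

single_list = merge_results(RESULTS)
for url, engines in single_list.items():
    print(url, "found by", ", ".join(engines))
```

The multiple-list format would simply be the RESULTS dictionary itself, shown one engine at a time.
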
The main thing to understand here is the Pros and Cons. 


The biggest pro is that these metasearch engines are super-fast, which is what we all want. Awesome! Right? Well, this is too good to be true, because the major con that comes with their speed is that the results they give back can run to hundreds and hundreds of hits, so you’re stuck trying to separate the good sources from the bad. No bueno!



Examples of these engines are Dogpile and Mamma.

Subject Directories: a Definition



Feeling Overwhelmed? Again, we will tell you the same thing, don’t be! We are still breaking all of this down for you so that you can process it slowly.

Let’s talk about subject directories. These directories are controlled by humans. Editors review sites and select the ones to include in their directories according to specific criteria. When you search, your keywords are matched against the written descriptions provided by those editors. This process is time consuming for them, but it makes searching more efficient for you. The editors usually index just the home page or the top levels of a site. Subject directories come in all kinds: general, academic, commercial, portals and vertical portals (vortals). Portals link popular subject categories and offer other services like email, current news, stock quotes, travel information, and maps. You will learn about vortals in the following chapter.

A definite pro of subject directories is that they return fewer hits; therefore, the results are more likely to be accurate. Less time consuming, right? Unfortunately, a most definite con is that they very often return dead links.

Links to Subject Directories

Links to Portals (subject directories serving as home pages)


Library Gateways and Specialized Databases: a Definition

In this chapter you will learn about the sites you want to be familiar with if you are researching topics for assignments such as research papers or informational presentations. Why? Wait and you will see...

There are two kinds of databases: library gateways and subject-specific databases, also known as vortals (see previous chapter).


Library gateways are collections that contain databases and informational sites that have been arranged by subject, assembled, reviewed and recommended by specialists, usually librarians. The results you get are academically oriented pages on the web--like we said before, these are the ones you want to target when the information you need needs to be extra reliable. 

 If you ask us, Library Gateways are similar to our old fashioned, good ole' books. If you are older than thirty-five, you might remember the agony of doing your research paper during your senior year in high school and having to predominantly use books for sources because your high school library was barely getting Internet. Yup! That might have been us! 




Now what are these vortals we speak of? They are subject specific databases.

Subject-specific databases, or vortals (i.e., "vertical portals") are databases that are a bit more specific since they are devoted to a single subject and are created by subject experts such as professors, researchers, governmental agencies, business interests, and other subject specialists and/or individuals who have a deep interest in the subject and professional knowledge of a particular field.

If you ask us, these are the databases you should be hitting up since they are the more reliable sources.

Now don’t get confused, because we are about to uncover a top secret little something we learned about the web: THE INVISIBLE WEB. Wait! What? Do you have an app on your smartphone that hides specific data, whether it is passwords or confidential files? Well, the invisible web is somewhat similar to this.

About 60 to 80 percent of existing web material is made up of documents that are archived or hidden behind password-protected sites and firewalls. People erroneously assume that such documents are irretrievable. Although they are often not visible to search engine spiders, today’s search engines are learning to find and index the contents of these “Invisible Web” pages. To find them you must point your browser directly at them, and that’s what the library gateways and subject-specific databases help you do.

So really, all we need to navigate through these sites is the right gateway.
If you’re wondering when it is best to use which, wonder no more… 

Library Gateways are best when you are looking for high-quality information. Subject-Specific Databases are best when used to get, can you guess it? You’re right! Specific information on a subject!


Examples of subject-specific databases:
Educator's Reference Desk (educational information)
Expedia (travel)
Jumbo Software (computer software)
Kelley Blue Book (car values)
Monster Board (jobs)
Motley Fool (personal investment)
MySimon (comparison shopping)
Roller Coaster Database (roller coasters)
Voice of the Shuttle (humanities research)
WebMD (health information)

Evaluating Web Pages



So maybe the first four chapters might have been a bit overwhelming because let’s be honest, although great, this information is a bit intimidating. Intimidating in the sense that we might not have been familiar with the concepts and even terminology so naturally that makes it a bit scary. Now this chapter gets a little bit more real and a whole lot easier to understand. Why? Because we will tell you how to determine if the information you’re gathering is being taken from reliable sources, because after all, you don’t want to get your description of dinosaurs from a website created by an aficionado who is NOT an expert in dinosaurs. Or do you?

The first step to success is to learn to read the website address. Every aspect of an address is crucial in determining how reliable the site is.

Let’s start with the basics. Look at this URL:

This is what everything means:
  • "http" means hypertext transfer protocol and refers to the format used to transfer and deal with information
  • "www" stands for World Wide Web and is the general name for the host server that supports text, graphics, sound files, etc. (It is not an essential part of the address, and some sites choose not to use it)
  • "sc" is the second-level domain name and usually designates the server's location, in this case, the University of South Carolina
  • "edu" is the top-level domain name (see below)
  • "beaufort" is the directory name
  • "library" is the sub-directory name
  • "pages" and "bones" are folder and sub-folder names
  • the second "bones" is the file name
  • "shtml" is the file type extension and, in this case, stands for "scripted hypertext mark-up language" (that's the language the computer reads).  The addition of the "s" indicates that the server will scan the page for commands that require additional insertion before the page is sent to the user.
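
If you would like to see the same breakdown done by a computer, here is a small Python sketch using the standard library. Loud assumption: the address in the code is pieced together from the parts listed above, so treat it as an illustration only. The function name `describe` is also made up for this example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: this address is assembled from the parts in the list above.
URL = "http://www.sc.edu/beaufort/library/pages/bones/bones.shtml"

def describe(url):
    """Split a web address into the pieces described in the list above."""
    parts = urlparse(url)
    host_labels = parts.hostname.split(".")  # e.g. ['www', 'sc', 'edu']
    path = PurePosixPath(parts.path)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "top_level_domain": host_labels[-1],
        "second_level_domain": host_labels[-2] if len(host_labels) > 1 else "",
        "folders": list(path.parent.parts[1:]),  # skip the leading '/'
        "file_name": path.stem,
        "file_type": path.suffix.lstrip("."),
    }

INFO = describe(URL)
for label, value in INFO.items():
    print(label, "=", value)
```

The piece you will use most when judging a site is the top-level domain, which is covered next.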
These are the domains that are currently recognized:
  • .edu -- educational site (usually a university or college)
  • .com -- commercial business site
  • .gov -- U.S. governmental/non-military site
  • .mil -- U.S. military sites and agencies
  • .net -- networks, internet service providers, organizations
  • .org -- U.S. non-profit organizations and others
These are the newer domains that are either coming into use or will soon:
  • .aero -- restricted use by air transportation industry
  • .biz -- general use by businesses
  • .coop -- restricted use by cooperatives
  • .info -- general use by both commercial and non-commercial sites
  • .museum -- restricted use by museums
  • .name -- general use by individuals
  • .pro -- restricted use by certified professionals and professional entities


DETERMINING PAGE AUTHORSHIP

You obviously need to know where the information is coming from and, most importantly, who is putting it out there. So first, you have to learn about the author or publisher, because you need to know what their views, opinions, and purpose are founded on. Here is what you have to ask yourself:

1. Who is responsible for the page you are accessing? Is it a governmental agency or other official source? A university? A business, corporation or other commercial interest? An individual?

It is generally safe to trust pages on the GOV and EDU domains to present accurate information.

The NET, ORG, MIL, and COM domains are more likely to host pages with their own personal or organizational agendas and might require additional verification.
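
As a quick illustration, this rule of thumb can be written as a tiny Python check. It is only a sketch of the grouping described above, not a real reliability test, and the function name and example addresses are made up.

```python
from urllib.parse import urlparse

# Groupings taken from the paragraphs above: a first impression, not a guarantee.
USUALLY_RELIABLE = {"gov", "edu"}
VERIFY_FIRST = {"net", "org", "mil", "com"}

def first_impression(url):
    """Give a rough first impression of a page based on its top-level domain."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    if tld in USUALLY_RELIABLE:
        return "usually reliable"
    if tld in VERIFY_FIRST:
        return "verify before trusting"
    return "unknown domain, look closer"

# Hypothetical addresses, used only to show the three outcomes.
print(first_impression("http://www.example.edu/library"))
print(first_impression("http://www.example.com/shop"))
print(first_impression("http://www.example.museum/exhibits"))
```

Even a "usually reliable" result only tells you where to start. You still need to check the author and the content.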

CHECKING THE VITAL INFORMATION
A trustworthy Web page will more than likely provide you with this information:
  • Last date page updated
  • Mail-to link for questions, comments
  • Name, address, telephone number, and email address of page owner
Now ask yourself this: If the page owner is not readily recognizable, do they provide credentials or some information on their sources or authority?

CHECKING THE CONTENT

If it’s on the Internet, then it must be true! Right? Wrong! You have to be careful in evaluating the information you’re getting. It is safe to assume that scholarly books and journal articles are reviewed, but who reviews websites or checks them for biases? Can the information you are finding be verified? It is also important to consider how often the information is updated. What was posted yesterday may be changed tomorrow. Check!

Creating a Search Strategy

Now that you have an idea of which web pages are reliable and which ones are not, you need to create a plan. And obviously, in order to do this, you must know what your purpose is. What are you looking for?

Do you want to:
  1. Browse?
  2. Locate a specific piece of information?
  3. Retrieve everything you can on the subject?
Your answer will determine how you conduct your search and what tools you will use, and also, how you word your searches. 
  1. If you're browsing and trying to determine what's available in your subject area, start out by selecting a subject directory like Yahoo! Then, enter your search keyword(s) into one of the metasearch engines, such as Vivisimo, just to see what's out there.
  2. If you're looking for a specific piece of information, go to a major search engine such as Google, or to a specialized database such as Bureau of the Census (for statistics).
  3. If you want to retrieve everything you can on a subject, try the same search on several search engines. Also, don't forget to check resources off the Web, such as books, newspapers, journals and other print reference sources.
Now here is the tricky stuff…
If you are not specific, these engines can add the words “and” or “or” to link your keywords together. For obvious reasons, this alters your results and you might not get what you are looking for. Sometimes words are ignored, or the engine treats each word separately, and the results are irrelevant. There is also a list of words known as “stop” words, which are usually cut out of your search to cut down response time. These words include “a, about, an, and, are, as, at, be, by, from, how, I, in, is, it, of, on, that, the, this, we, what, when, where, which, with,” and so on. If the phrase you are looking for has to include one of these stop words, you might want to put quotation marks around the whole phrase.
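
Here is a minimal Python sketch of what dropping stop words might look like, using the word list above. How a real engine treats stop words and quoted phrases is up to that engine, so take this as an assumption-filled illustration. The function name is made up.

```python
import re

# The stop words listed in the paragraph above.
STOP_WORDS = {
    "a", "about", "an", "and", "are", "as", "at", "be", "by", "from",
    "how", "i", "in", "is", "it", "of", "on", "that", "the", "this",
    "we", "what", "when", "where", "which", "with",
}

def effective_terms(query):
    """Return the terms left after stop words are dropped.

    Anything inside double quotation marks is kept whole, stop words included.
    """
    terms = []
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        if phrase:
            terms.append(phrase)  # a quoted phrase survives intact
        elif word.lower() not in STOP_WORDS:
            terms.append(word)
    return terms

print(effective_terms("the history of the logo"))
print(effective_terms("the who band"))
print(effective_terms('"the who" band'))
```

Without quotation marks, "the" disappears from the second search. With them, the whole phrase is searched as typed.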



CREATING A SEARCH STATEMENT
The following tips make for effective search statements:
  • Be specific
        EXAMPLE:    Hurricane Hugo
     
  • Whenever possible, use nouns and objects as keywords
        EXAMPLE:    fiesta dinnerware plates cups saucers
     
  • Put the most important terms first in your keyword list; to ensure that they will be searched, put a + sign in front of each one
        EXAMPLE:    +hybrid +electric +gas +vehicles
     
  • Use at least three keywords in your query
        EXAMPLE:    interaction vitamins drugs
     
  • Combine keywords, whenever possible, into phrases
        EXAMPLE:    "search engine tutorial"
     
  • Avoid common words, e.g., water, unless they're part of a phrase
        EXAMPLE:    "bottled water"
     
  • Think about words you'd expect to find in the body of the page, and use them as keywords
        EXAMPLE:    anorexia bulimia eating disorder
     
  • Write down your search statement and revise it before you type it into a search engine query box
        EXAMPLE:   +"south carolina" +"financial aid" +applications  +grants
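
To tie these tips together, here is a short Python sketch that assembles a search statement from a list of terms: phrases get quotation marks and required terms get a + sign. The function name is made up, and whether a given search engine honors the + sign is an assumption you should check for the engine you use.

```python
def build_search_statement(required, optional=()):
    """Join terms into one search statement.

    Multi-word terms are wrapped in quotation marks, and required terms
    get a + sign in front. Put the most important terms first.
    """
    def fmt(term):
        return f'"{term}"' if " " in term else term

    parts = ["+" + fmt(term) for term in required]
    parts += [fmt(term) for term in optional]
    return " ".join(parts)

STATEMENT = build_search_statement(
    ["south carolina", "financial aid", "applications", "grants"]
)
print(STATEMENT)
print(build_search_statement(["hybrid"], optional=["bottled water", "cups"]))
```

Writing the terms out as a list first is one way to follow the last tip: you can reorder or drop terms before anything gets typed into a search box.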