Microsoft's new search engine
(by Tom Wilson, posted at 11:18 AM)
There's been a burst of interest on the Net about Microsoft's new search technology, which can be found in beta form at MSN Search, but it doesn't look all that great to me.
"The release of our beta is a huge step towards delivering the information consumers are looking for online, faster," says a Microsoft spokesman. However, my test is where Information Research appears when I search for it and, on this basis, MSN Search lags behind the others. For example, when I searched for "information research", the Weblog was the first thing to appear - at the bottom of the first page of results. The journal site didn't appear until page six, where it was the last item on the page. One issue with MSN Search is that it appears to ignore word order - there seemed to be as many occurrences of "research information" as of the phrase "information research", which doesn't seem very intelligent to me.
For comparison, here's a table of results for the same search with other search engines:
| Search engine | Page number | Non-sponsored position |
| Alltheweb | 1 | 2 |
| Alta Vista | 1 | 1 |
| AOL Search | 1 | 1 |
| Ask Jeeves | 1 | 3 |
| Excite | 1 | 1 |
| Gigablast | 1 | 1 |
| Google | 1 | 1 |
| HotBot | 1 | 1 |
| Lycos | 1 | 2 |
| Teoma | 1 | 3 |
| Yahoo | 1 | 2 |
By this little test, the new MSN engine doesn't show up very well!
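For anyone who wants to repeat this kind of test, the bookkeeping can be sketched in a few lines of Python. This is only an illustration: the function name `locate`, the assumption of ten results per page, and the made-up result list are all hypothetical, and nothing here queries a real search engine.

```python
def locate(results, target, per_page=10):
    """Return (page, position on page) of the first result whose URL
    contains `target`, or None if it never appears.
    Assumes a fixed number of results per page (ten by default)."""
    for index, url in enumerate(results):
        if target in url:
            return index // per_page + 1, index % per_page + 1
    return None


# Hypothetical result list: 59 filler hits, then the journal site.
results = [f"http://example.org/page{i}" for i in range(59)]
results.append("http://informationr.net/")

print(locate(results, "informationr.net"))
```

With the journal as the sixtieth hit, the sketch reports page six, position ten, which is the "last item on page six" situation described above.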
|
Odds and ends
(by Tom Wilson, posted at 11:00 PM)
Here's an interesting little item on Google.
TechWeb Today points to a new TechEncyclopedia, with 20,000 terms. Curiously, this doesn't display correctly in Firefox, although when I download the page to look at the code, the downloaded version displays perfectly well. Something strange going on here!
|
Weblogs and other things
(by Tom Wilson, posted at 9:38 PM)
Weblogs
My thanks to folks, on and off the Weblog, who've written to encourage me to keep the Weblog going—I'll plod on when I know that it has some effect. Carol Cahill kindly says:
Our library probably wouldn't have a wireless Internet connection if my interest hadn't first been piqued by your Weblog. Now we have a four-laptop wireless training lab and patrons can come in and connect with their own computers.
Which I think is rather better than a citation in a journal :-)
"The Chief's" comments on Weblog membership counts are also interesting - as are the usage stats for the Weblog - last year 13,776 hits, this year, so far, 13,588, with those hits distributed over the continents as follows:
| 1. | North-America | 10,780 | 39.4% |
| 2. | Europe | 10,565 | 38.6% |
| 3. | Asia | 2,961 | 10.8% |
| 4. | Australia | 1,735 | 6.3% |
| 5. | Africa | 415 | 1.5% |
| 6. | South America | 277 | 1.0% |
| 7. | Central America | 133 | 0.5% |
| | Unknown | 498 | 1.8% |
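The percentages in the table can be checked with a short calculation. The sketch below simply re-derives them from the counts given above; the variable names are illustrative. Note that the continent counts add up to 27,364, which is the sum of the two yearly figures quoted (13,776 + 13,588), so the table covers both years together.

```python
# Hit counts by continent, copied from the table above.
hits = {
    "North-America": 10780,
    "Europe": 10565,
    "Asia": 2961,
    "Australia": 1735,
    "Africa": 415,
    "South America": 277,
    "Central America": 133,
    "Unknown": 498,
}

total = sum(hits.values())

# Share of each continent as a percentage, rounded to one decimal place.
shares = {region: round(100 * count / total, 1) for region, count in hits.items()}

for region, share in shares.items():
    print(f"{region:16} {hits[region]:>6} {share:>5}%")
print("Total", total)
```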
Yahoo! does a Google
News today of Yahoo!'s purchase of an e-mail start-up, by the name of Bloomba (why does the Internet generate so many silly names? Scope for a PhD dissertation here!). I'd never heard of Bloomba before, but it is an e-mail client, rather than a Web-based service. Reviews suggest that its killer feature is its search capacity; it indexes your mail as you receive it, including what's in attachments. Whatever plans Yahoo! has for the system, no one seems to know. The original parent company, Statalabs, says:
What does Yahoo! plan to do with the technology as a result of the acquisition?
At this time we do not have any announcements about the ongoing plans for the technology or the specifics of the transaction.
A case of 'Watch this space' - well, not this one, since I can't guarantee that I'll spot an announcement, but perhaps the Yahoo! site - and while you are there, you might like to take a look at MySearch
|
Odds and ends
(by Tom Wilson, posted at 10:35 AM)
The Weblog
It seems that my suspicions about the lack of general interest in the IR Weblog are confirmed :-) I've been contributing very little over the past month and so far no one has asked, Where are you?
New issue of the journal
The latest issue, Volume 10 No. 1, is now on the site. This one has the first batch of papers from the Information Seeking in Context conference, held in Dublin last month. The other half will be published in the January 2005 issue. I finally got round to checking on what logs were available on the server and discovered that, since the 8th October (which is when the analysis software appears to have kicked in), there have been about 280,000 hits on the InformationR.net site - most of which are on the journal. This is considerably beyond my own estimates from the various counters. InformationR.net is the sixth most 'popular' virtual domain on the University's servers.
Voice over Internet Protocol (VoIP)
VoIP appears to be building up nicely. I finally got round to using it, along with colleagues in the AIMTech research group at Leeds University Business School. The voice quality, using Skype, is generally pretty good - not quite as good as the best landline, but good enough considering that it's free. I've also tried the SkypeOut service, which connects to landline numbers pretty well anywhere in the world and to mobile phones in some. You can connect to landlines in Western Europe, North America, Australia and New Zealand for 1.7 Euro cents a minute (£0.0118 or $0.02129) - mobiles cost a good deal more. Connection with landlines can be variable - sometimes connection is lost and in one case there was no voice connection at all. No doubt, with the interest being expressed, these problems will get ironed out.
Of course, governments and the big telecomms companies get very edgy over VoIP - here's a communication process where they may not be able to make any money, unless they REGULATE. Naturally, it is the USA where these concerns are raised.
It had to happen: "Boingo, Vonage Sign VoWi-Fi Pact"
Google again
A couple of things about Google - first, you'll find a review of its e-mail service, Gmail, in the latest issue of the journal. Secondly, I'm also trying out its 'desktop search' program - this enables you to do a Google search on your hard disc. It also checks your hard disc when you do a Web search - useful for bringing to your attention those items you'd forgotten you'd ever written!
|
Google in the news
(by Tom Wilson, posted at 4:44 PM)
Google is in the news again - on the 5th October it issued a 'new features' message to users of Gmail, to the effect that it was trialling a new mail forwarding system, which would be free during the trial. This prompted commentators to speculate on what other features of Google in general would become revenue streams.
As it happens, I've been using Gmail as a beta user for the past couple of months and I'm now hooked on it; a review will appear in the October issue of Information Research. Its 1 GB filestore, use of 'labels' to index messages, and grouping of messages into 'conversations' make it a real winner.
|
New book
(by Tom Wilson, posted at 4:18 PM)
Congratulations to one of our Editorial Board members, Amanda Spink, for her new book, jointly authored with Bernard Jansen: "Web Search: Public Searching of the Web" - you can find details at the publisher's Website.
|
Popular papers in Information Research
(by Tom Wilson, posted at 8:42 PM)
Having recently published a new issue of Information Research, I thought it was time to find out how the ranking by 'hits per month' was standing. So here's the latest table. We see that some very recent papers appear to have struck a chord, while some of the oldest papers are still going strong.
|
AI and search engines
(by Tom Wilson, posted at 2:11 PM)
A highly favourable item on a new search engine, blinkx, in the Guardian Online supplement sent me off to its Website to check it out. blinkx uses, so we are told, an AI technique rather than page-ranking a la Google, and it searches not only the Web, news services, and Weblogs, but also your hard disc. From one of the file names on the downloaded system I suspect that the engine behind blinkx is Autonomy.
The Website includes an option to try out the beta version of blinkx optimised for broadband users and I discovered something rather odd. The PR claims that "blinkx understands your question and presents you with links as you search." - but the system obviously uses stop words. How can a question be understood if the stop words include terms of significance to the user?
Specifically, I searched for 'Information Research', expecting the journal site to pop up fairly quickly - no: only things on 'research' appeared. Similarly, when I used 'information behaviour', only 'behaviour' was used as a search term, and for 'information science', only 'science'. Not much use in the information management sector, then! The give-away is that the terms used in the search are highlighted and in all cases, where 'information' did appear in an item, it was not highlighted.
'Information' on its own may or may not be a useful search term - certainly it would generate millions of hits, but when used in compounds such as those mentioned, the concept so formed has much greater specificity. As long as AI systems continue to fail to recognize concepts and their semantic significance, they will fail to produce a search system that is a significant improvement on Google.
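The effect is easy to reproduce. The sketch below is not blinkx's algorithm, which is not public; it assumes a naive stop-word filter and a hypothetical stop list that happens to include 'information', which is what the highlighting suggests is going on.

```python
# Hypothetical stop list; the real one used by blinkx is unknown.
STOP_WORDS = {"a", "an", "and", "of", "the", "information"}


def keywords(query, stop_words=STOP_WORDS):
    """Split a query into lower-case words and drop any stop words."""
    return [word for word in query.lower().split() if word not in stop_words]


for query in ("Information Research", "information behaviour", "information science"):
    print(query, "->", keywords(query))
```

Each compound collapses to its second word, so the specific concept is lost before the search even starts. Passing an empty stop list, `keywords(query, stop_words=set())`, keeps both words.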
|
Google
(by Tom Wilson, posted at 9:51 PM)
Google is also hitting the news this week - with new services announced and, in today's Guardian, a big article about Google's intention to offer a free e-mail service to compete with Yahoo! and Hotmail, offering a gigabyte of storage - way above the limits of the other two. I'll join that! Get more on this from Google itself.
|
A Google Game
(by Tom Wilson, posted at 4:35 PM)
There are all kinds of games you can play with Google, including the well-known 'Googlewhacking'. I don't know whether I've invented this one, which I discovered accidentally.
It is very simple: just hit a few keys haphazardly, for example, "l;kd" in the search box of Yahoo and see what turns up. The aim is to put in something that returns nothing - which is surprisingly difficult! That combination, for example, turned up more than one and a half million hits! Even entered as a phrase, it produced almost 20,000.
The string ";we[kear'k" resulted in 34 hits, largely as a result of the existence of an author called "K. Kear". However, as a phrase, it produced zero - so it can be done. Remember, however, that the entry of symbols should be haphazard: just let your fingers do the choosing.
|
In the news...
(by Tom Wilson, posted at 10:30 AM)
An interesting item on wireless in the public library from LIS News.com
...and a longer piece on IT in public libraries from D-Lib Magazine
Turning to the University sector, I picked this up from Seb's Open Research - a couple of courses at Prince Edward Island University are using Weblogs as resource pages and communication. Here's one on 'Networking, knowledge and the digital age'.
And here's an interesting one! I initiated a debate on the JESSE list some time back on the extent to which Web citation was beginning to overtake journal citation as a performance tool. I then found that this had been picked up by a couple of researchers (Vaughan and Shaw, Bibliographic and Web citations: what is the difference? JASIST, 54(14), 2003, 1313-1322) and now ISI is getting together with NEC: Thomson ISI and NEC Team Up to Index Web-based Scholarship
PHILADELPHIA & LONDON & PRINCETON, N.J.--(BUSINESS WIRE)--Feb. 25, 2004--Today, Thomson ISI and NEC Laboratories America (NEC) announced their collaboration to create a comprehensive, multidisciplinary citation index for Web-based scholarly resources. The new Web Citation Index(TM) will combine a suite of technologies developed by NEC, including "autonomous citation indexing" tools from NEC's CiteSeer environment, with the capabilities underlying ISI Web of Knowledge(SM). Thomson ISI editors will carefully monitor the quality of this new resource to ensure all indexed material meets the Thomson ISI high-quality standards.
During 2004, Thomson ISI and NEC will operate a pilot of the new resource to receive feedback from the scientific and scholarly community. Full access to the index is projected for early 2005.
When fully operational, the new resource will be a unique content collection within ISI Web of Knowledge. It will complement the Thomson ISI Web of Science®, and provide researchers with a new gateway to discovery -- using citation relationships among Web-based documents, such as pre-prints, proceedings, and "open access" research publications
OK - that's enough for now - I've got to go off to talk with the people at Orange about mobile technologies.
|
Search engines and the FT
(by Tom Wilson, posted at 10:02 AM)
I didn't get to the Saturday issue of the FT before this morning and there I found a leader item on search engines. I don't think I've seen a newspaper leader on the subject in the UK before. The item is 'Online searching: who's feeling lucky?' - available on the FT web site, but only to subscribers. The main point about the article is the suggestion that with the limited number of search engines available, or rather, the dominance of Google, there's a need for 'one fully transparent search engine, preferably maintained in the academic realm.' Isn't it curious how the advocates of capitalism always find a role for the public sphere when they want something unbiased? :-) The suggestion was made originally by Google's founders, Sergey Brin and Larry Page, in a research paper, but I haven't been able to locate it on the Web.
Good luck to the FT, but the chances of any university in the UK picking up the challenge to provide a 'fully transparent search engine' are pretty remote. You can count the institutions pursuing serious information retrieval research on the fingers of one hand and still have spare capacity, and so deeply mired in managerialism are the institutions that the probability of selfless public service is remote. Everything these days must have an 'income stream', nothing is done for nothing, and the tentacles of central government's assessment procedures stretch everywhere.
|
Search engines
(by Tom Wilson, posted at 7:27 PM)
Old news now - two days old - that Yahoo! has dropped Google as its search engine in favour of its own search engine, provided by Inktomi. So I wondered how it compared. I searched for "Information Research" using both and, surprise, links to the journal were 1st and 2nd in Yahoo! Search and also in Google. Not much difference there. So, I searched for "case-based reasoning" at ".edu" sites. In the first 20 links for each search engine, only five institutions were duplicated, and from these five institutions only four Web pages were duplicated. It would seem, then, that the two engines are doing different things and that, if you want a reasonably comprehensive coverage of a topic, it would be a good idea to use both.
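The comparison amounts to intersecting two result lists, once by page and once by institution. Here is a minimal sketch, assuming each engine's top results are available as a list of URLs; the URLs below are invented placeholders, not the actual results.

```python
from urllib.parse import urlparse


def shared(first, second):
    """Items that occur in both lists, in the order of the first list."""
    seen = set(second)
    return [item for item in first if item in seen]


def hosts(urls):
    """Reduce each URL to its host name, as a rough proxy for 'institution'."""
    return [urlparse(url).hostname for url in urls]


# Hypothetical top results from the two engines.
yahoo = ["http://www.alpha.edu/cbr.html", "http://www.beta.edu/ai/cbr", "http://www.gamma.edu/x"]
google = ["http://www.beta.edu/ai/cbr", "http://www.alpha.edu/other.html", "http://www.delta.edu/y"]

print("Pages in common:", shared(yahoo, google))
print("Institutions in common:", shared(hosts(yahoo), hosts(google)))
```

In this toy data two institutions are shared but only one page, the same pattern as in the real test, where five institutions but only four pages were duplicated.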
|
NewzCrawler
(by Tom Wilson, posted at 2:45 PM)
Having used the news aggregator, NewzCrawler, for some months now, I finally decided, when the evaluation period came to an end, that I can't live without it - and the $24.95 seems a modest price to pay. It isn't perfect, but then what software is?
The need for a news aggregator, assuming that you still haven't cottoned on to the need, is the increasing popularity of RSS feeds that provide the raw material for aggregators. A recent development at Yahoo! makes RSS feeds available for news searches. For example, if you want to pick up every mention of Tony Blair (heaven forfend) that occurs in the news sources covered by Yahoo!, use this URL in your aggregator. Read about this development at Jeremy Zawodny's Weblog
My aggregator now has links to fifty news and information sources - it's continually growing and continually being weeded as I find new things and get rid of dross - of which there is much!
Search engines
(by Tom Wilson, posted at 2:20 PM)
There's a useful account of developments in search engines during 2003 at Sitepoint - I was pointed there from the Logos Weblog, which has some interesting stuff. I liked this comment from the section on the future:
Watch Microsoft carefully. If a new Microsoft-based search initiative gets off the ground this year, you can bet it will be well funded and well promoted. Site owners can benefit from first-mover advantages in getting listed. If you can become an early expert in the new search technology, your site and traffic could soar.
|
Stuff you don't need to know
(by Tom Wilson, posted at 5:59 PM)
As everyone knows, Information Research uses the Atomz.com search engine - which is made freely available. I've just been experimenting with restricting the pages that are scanned, but it didn't work out. However, in the process I had to ask for the site to be re-indexed (this normally happens automatically every Sunday night) and the log for the indexing tells me that 417 pages have been indexed containing 1,546,605 words.
Wow - 1.5 million words - I had no idea that we'd published as much as that. Now, that includes contents pages (which I was trying to mask) and the editorials but, nevertheless, that's a lot of words. And, given the volume of hits, it seems that people find them useful words.
While I was at it, I checked on the language used in the searching: here are the search strings used last month:
| Frequency | Search string |
| 18 | knowledge management |
| 12 | information management |
| 11 | data mining |
| 8 | electronic resources |
| 8 | information conciousness |
| 7 | cko and failure |
| 7 | communication |
| 7 | digital library |
| 7 | information retrieval |
| 7 | management |
| 6 | cko |
| 6 | e publishing |
| 6 | information literacy |
| 6 | information seeking behaviour |
| 6 | online public access catalog |
| 6 | outsourcing |
| 6 | pattern of communication adopted by marketing department in industrial goods sec |
| 6 | search engine |
| 5 | company libraries |
Some of this strikes me as odd and must be the result of some people clicking on the 'Go' button more than once.
|
Google and InformationR.net
(by Tom Wilson, posted at 5:54 PM)
As a result of getting O'Reilly's 'Google hacks' for review, I've tweaked one of the examples to provide a site-search feature for InformationR.net. Try it out and let me know what you think about it.
|
Hitting the site...
(by Tom Wilson, posted at 7:48 PM)
I imagine that most readers of Information Research will be aware of the counter on the top page. What they may not know is that I regularly collect information from the counter service on where the hits are coming from. My 'harvest' now totals 4,158 hits - collected since 1 November 2002 - and shows hits arriving at the site from referring sites (almost 500 of them). Only a few sites account for 2.0% or more of the 4,158 and I show them in the table below.
It's a curious list consisting of a variety of organized resource 'directories', like BUBL, together with one other e-journal, an academic site hosted by the Department of Communication at the University of Washington, the search engine, Google, and one item in a newsletter about search engines.
The last of these - Searchday from Search Engine Watch - demonstrated the impact of certain sources: the item was published on 27 May 2003 and it immediately led to a peak in the hits curve, and hits from that page have been arriving ever since, to the effect that it now accounts for 2.5% of all the hits on the top page.
The Directory of Open Access Journals also illustrates how a new site can have an immediate impact on traffic - I don't recall when the hits first appeared, but it was only earlier this year, and it now accounts for almost 3% of the total.
The data on Google are a bit of a cheat - in fact, if one takes all 28 Google sites (from www.google.ae to www.google.sk), the search engine in its different manifestations accounts for 7.55% of the total hits.
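Pooling the national Google sites is a matter of grouping referrers by host name. The following is a rough sketch only: the referrer counts are invented, and matching on a host name that starts with "www.google." is an assumption about how the referring URLs look.

```python
from urllib.parse import urlparse


def google_share(referrer_counts):
    """Percentage of all hits whose referring host starts with 'www.google.'."""
    total = sum(referrer_counts.values())
    google = sum(
        count
        for url, count in referrer_counts.items()
        if (urlparse(url).hostname or "").startswith("www.google.")
    )
    return 100 * google / total


# Hypothetical referrer counts, not the real counter data.
counts = {
    "http://www.google.ae/search": 3,
    "http://www.google.sk/search": 2,
    "http://www.google.com/search": 10,
    "http://bubl.ac.uk/": 5,
}

print(f"{google_share(counts):.2f}% of hits came via Google")
```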
Searching and the Weblog
(by Prof. Tom Wilson, posted at 5:02 PM)
Now there's a funny thing - it seems that a number of people are hitting the
Weblog through searching for something completely different, and yet decide to
have a look.
For example, someone searching for 'Captain Stabbin' on msn.com found the link
to my item on the Nigerian scam at number 28 on the output list - yet still
clicked on the link to the log. I can't imagine the naive user doing that, so I
assume that it must have been someone who recalled seeing my message on the log
and wanted to find it again.
Similarly, someone searching for 'Internet 2' on Google found the Weblog link at
number eight in the list, yet followed it. And another Google search for 'Joint
use libraries' AND 'Syllabus', resulted in 14 items, of which the Weblog item
was number 13 - and yet that was followed. Surely cases of people wanting to
find things they'd seen before.
It seems unlikely, however, that someone searching for "What is the official
name from the standards organization of the 11mbs wireless networking
standard?" again on Google, would have seen a specific message on this topic.
In fact, the link to the Weblog was number two on the list and led to the
'Wireless' channel of the log - nothing there to answer the question.
Most curious of all, however, was a search for 'sugar daddy phenomenon' - and,
lo and behold!, the Weblog item that includes all three words is item number
two - this time the 'Electronic publishing' channel of the log, which includes
an item on the open access 'phenomenon', posted on 12 September 2003, which
included a request for a 'sugar daddy' to support the journal.
Curious indeed are the ways of search engines and people - you can check this
out at the
counter service.
Incidentally, of 21 hits from search engine search outputs, 16 used some variant
of Google.
Thinking of buying Google?
(by Tom Wilson, posted at 4:10 PM)
Check out the Fortune article.
|
Information Research and SSCI
(by Tom Wilson, posted at 3:32 PM)
I've just taken the time to check Web of Science and it seems that all items in Information Research from Volume 8 no. 1 have now been indexed there. I look forward to ever-increasing hits :-) Speaking of which... the current hits on the top page now exceed last year's total by more than 10,000.
|
Pricing Google
(by Tom Wilson, posted at 5:29 PM)
The possibility of privately-owned Google going public is giving financial analysts the trembles.
Wharton School of Management has a nice piece on it.
|
Odds and ends
(by Tom Wilson, posted at 1:12 PM)
Current Cites is an electronic publication I've drawn attention to before. Here are a couple of items that interested me:
I'm in the process of reviewing the latest version of EndNote, the bibliography organizer, and this version has a new feature, linking to the original source through the OpenURL protocol - coincidentally, Current Cites draws attention to an interview in the OCLC Newsletter with Herbert Van de Sompel, the originator of the protocol and a key figure in the Open Archives Initiative
The other piece is from First Monday, that e-journal that is just a little younger than Information Research :-) This paper concerns 'open content' - that is, what you are reading now, and what you read in every new issue of Information Research. Magnus Cedergren, the author of 'Open content and value creation', states in the abstract:
In this paper, I consider open content as an important development track in the media landscape of tomorrow. I define open content as content possible for others to improve and redistribute and/or content that is produced without any consideration of immediate financial reward often collectively within a virtual community. The open content phenomenon can to some extent be compared to the phenomenon of open source. Production within a virtual community is one possible source of open content. Another possible source is content in the public domain. This could be sound, pictures, movies or texts that have no copyright, in legal terms.
and in the body of the paper he looks at three examples of open content:
All in all, an interesting paper.
|
The Visual Thesaurus
(by Tom Wilson, posted at 1:45 PM)
John Holgate has drawn the attention of IR-Discuss members to the:
Plumb Visual Thesaurus developed since 1996 in the Princeton University Concept Labs. IMO it's the biggest breakthrough in semantics since Carnap invented 'intension'.
It is interesting that the VT's 'view' of the concept information comes directly from
the Princetonian definition:
'a message sent and received that reduces the receiver's uncertainty' (ho hum)
but it also separates out facts/documents/data from 'selective information'
(a la Shannon communication theory) and the entropy/ectropy strand beloved
of the physicists.
The strange little entity labelled 'info', which is appearing more and more in
biology circles is, perhaps fittingly, without a definition.
I suggest you try playing with 'knowledge' and 'experience' for good measure
and see how meanings appear to have their own momentum and relationships -
like in the world beyond thesauri and dictionaries.
Thanks for that, John.
|
More odds and ends
(by Tom Wilson, posted at 9:48 PM)
Grahame Gould drew my attention to the fact that the Free-Conversant server has been down and the last lot of 'Odds and ends' was not reachable - so, as compensation, here's another lot. I've no idea why the server was down, not having had any information about it.
Music piracy is hitting the headlines again. Regular aficionados of this site may recall an earlier message on the subject and today the music industry won plaudits for its suing of a 71-year old grandfather and a 12-year old child. That's the way to do it, guys - go for the soft targets. Naturally, it has been picked up by the other Weblogs and The Shifted Librarian raises a point or two.
The whole thing makes another item from Techdirt all the more interesting: apparently the music industry is using file-sharing networks it abhors to collect market research data.
On the search front, there's a rather curious hybrid at Anacubis, described as an integration of:
...the Amazon and Google search APIs with the anacubisTM Viewer to deliver an innovative and powerful new way to browse the extensive catalogue of books, CD, DVDs and videos for sale at Amazon.com - and then explore related information amongst Google's 3 billion plus web documents
The demo worked fine the first time I used it, but refused to perform again. Try it out, however - you never know your luck. I'm not sure who it is intended for - perhaps simply to show that the Anacubis visualisation software works - but I'm always chary of visualisation of searches, given the way people search and the limited responses they are happy with. Pictures are not always worth a thousand words. Thanks to ResearchBuzz for that one.
|
Google again
(by Tom Wilson, posted at 2:38 PM)
News from Search Engine Watch on the issue of who's got the biggest index. [Perhaps there'll be a new burst of spam - 'Proven ways to increase your index size!!!!!!!!!']
As Danny Sullivan, the author, says:
Size figures have long been used as a surrogate for the missing relevancy figures that the search engine industry as a whole has failed to provide. Size figures are also a bad surrogate, because more pages in no way guarantees better results.
However, it's easy to use pages indexed (even if you aren't telling the truth - see the article) in the publicity battle, so I guess it will keep on going.
|
Monday morning
(by Tom Wilson, posted at 8:09 AM)
Here's an interesting item from Current Cites - that useful alerting system for things about information and information technology: it concerns a new book from O'Reilly on 'Amazon Hacks', describing the tricks you can get up to in searching for books using the very powerful search engine at Amazon.com
This particular 'hack' discusses the advanced search possibilities, which go well beyond the typical Boolean search. Read all about it at the book's Web site.
On the Weblogging front, there's a dispute brewing up about RSS - Really Simple Syndication, or whatever use you want to make of those initials. The dispute surrounds the future of RSS and is too complicated to summarise here, so go look at the CNET News site.
|
Various
(by Tom Wilson, posted at 1:16 PM)
It's been a while since I posted to the log as I'm in Sweden and have been for the past week and too busy to give time to it.
I've also been experiencing server problems - unable to access my Webmail box at Sheffield for the past couple of days, so people may have been trying to contact me without my knowing. My Swedish address will serve for anyone who has been trying to reach me - "tom.wilson@hb.se"
I assume that many of you have been infected by the SoBig virus - I received a message from one correspondent saying that he had had 700 messages in one morning. I don't think I had that many, but I certainly had several hundred over the course of last week. It is no comfort to learn (from BBC News) that this has been the fastest proliferating virus of all time.
News on the search front today: my last entry related to Overture and now we learn (from CNET news) that Google has expanded its index beyond the 3.2 billion pages claimed by Overture. As the report says:
But since then, Mountain View, Calif.-based Google has quietly leaped ahead again, expanding its database to more than 3.3 billion Web documents by Thursday this week, according to its home page. A Google representative confirmed the change.
"Google raised the number on its home page to accurately reflect the number of Web pages it offers consumers," a representative wrote in an e-mail. The search company's worldwide index now includes 3.3 billion Web documents, 800 million Usenet pages and 400 million images.
On another front, the legal system hit a new high in the UK this week as a result of the Hutton Inquiry. Its Web site is attracting 'upwards of 80,000 visitors a day', according to the Guardian's Online supplement. The transcripts of the hearings into the circumstances surrounding the death of David Kelly make fascinating reading as politicians, their public relations staff and journalists dance around the questions put. The big news, of course, related to Tony Blair's appearance before the Inquiry earlier this week - the jury is out on that performance, but from what I read it was an assured one, with all the glibness of which the man is capable. Whether anything he says these days can be trusted is another matter, and the polls suggest that the public appreciation of him has waned considerably.
There are news and screenshots of the latest versions of MSoft's new (three years down the road?) operating system, code-named Longhorn, at WinSupersite.com. The thrust appears to be more and more towards multimedia integration - so I guess that's another zillion features that the typical user will make little use of!
Enough for now! Have a good week-end
|
Overture search engine
(by Tom Wilson, posted at 2:52 PM)
The Overture search engine, bought by Yahoo!, now has an index, courtesy of FAST - also bought by Yahoo! - of, it is claimed, more than 3.2 billion Web pages. (News from Research Buzz)
Ah, but can one find "Information Research", you ask? Well, it seems that it can. My usual test is to see whether the journal comes in the top two or three when searched for as a phrase or as just the two words. Overture turns up trumps - IR is the first listed 'additional' site. The first site returned is always a sponsored site, i.e., one that is paying to be listed.
There's a twist, however: the IR index page only comes up number 2 on the US listing. If one selects UK as the country, there are three 'sponsored' sites and the fourth site is the redirection page for IR on the Department of Information Studies site at Sheffield. One can play games like this all day: when I searched from the Netherlands page, the first mention of the journal was at number 3, but that was the catalogue entry at the Royal Library. From the Japan page I found no mention at all. Obviously, these country pages cover sites in the country, rather than international sites - except for the USA, which, I suppose, is thought to be international?
The Research Buzz item asks how Overture is going to wean people off Google - good question. Yahoo! has spent a lot of money acquiring search capabilities so, presumably, it must have a cunning plan.
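For anyone who wants to repeat the ranking test described above, here is a minimal sketch of the bookkeeping in Python. It assumes you have already copied a results page into a list by hand; the `positions` function, the `url` and `sponsored` fields and the example.org addresses are all made up for illustration and have nothing to do with any real search engine's interface.

```python
def positions(results, target):
    """Return (overall, non_sponsored) 1-based ranks of the first
    non-sponsored result whose URL contains `target`, or None if the
    target is not listed on the page."""
    non_sponsored = 0
    for overall, result in enumerate(results, start=1):
        if result["sponsored"]:
            continue  # paid listings do not count towards the ranking
        non_sponsored += 1
        if target in result["url"]:
            return overall, non_sponsored
    return None


# Hypothetical listings mirroring the pattern described in the post:
# one sponsored site ahead of the journal on the US page, three on the UK page.
us_listing = [
    {"url": "http://sponsor-a.example.com/", "sponsored": True},
    {"url": "http://journal.example.org/ir/", "sponsored": False},
]
uk_listing = [
    {"url": "http://sponsor-a.example.com/", "sponsored": True},
    {"url": "http://sponsor-b.example.com/", "sponsored": True},
    {"url": "http://sponsor-c.example.com/", "sponsored": True},
    {"url": "http://journal.example.org/ir/redirect.html", "sponsored": False},
]

print(positions(us_listing, "journal.example.org"))  # (2, 1)
print(positions(uk_listing, "journal.example.org"))  # (4, 1)
```

Keeping the overall and the non-sponsored positions separate makes it plain how much of a low placing is down to paid listings rather than to the index itself.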
|
News about Northern Light
(by Tom Wilson, posted at 4:08 PM)
It seems that Northern Light is planning to bring back its public-use search engine. A note on the Web site says:
If you're looking for the Northern Light web search engine, it is not currently open to the public. We are planning to bring it back later this year. If you would like to be notified when it is available again please sign up for our mailing list.
I used to use Northern Light quite a lot, but it pulled out of the public-use market, for some reason or another, and concentrated on selling its search engine to corporations. Given how the search engine arena has changed since Northern Light 'died', I wonder what motivates the decision to relaunch.
I spotted this item on the ResearchBuzz weblog.
|
Hitting on Google.
(by Tom Wilson, posted at 10:19 PM)
Slate has an article, Digging for Google Holes, that is critical of various aspects of searching Google, some of which seems pretty lame to me. For example, it claims that synonyms are a problem (big news, aren't they, for everybody!?) and cites the fact that:
Search for apple on Google, and you have to troll through a couple pages of results before you get anything not directly related to Apple Computer - and it's a page promoting a public TV show called Newton's Apple. After that it's all Mac-related links until Fiona Apple's home page. You have to sift through 50 results before you reach a link that deals with apples that grow on trees: the home page for the Washington State Apple Growers Association.
Presumably the writer is someone who rarely searches: I put 'apples "Washington State"' into Google and the Washington State Apple Commission came up first, with no trace of an Apple Computer. Shouldn't journalists who write on search engines try to learn something about them?
|
Google and Weblogs
(by Tom Wilson, posted at 8:24 AM)
The relationship between Weblogs and Google seems to be in the news again. The Register carries an article (again by Andrew Orlowski, who seems to have a thing about Weblogs) on the subject, claiming that searchers are fed up with links to Weblogs cluttering up their search results. I'm not sure about the circumstances under which this occurs, since I have yet to experience the phenomenon - probably means I'm just searching for serious, boring stuff rather than the latest gossip about Madonna or whoever...
The subject is also tackled in a recent article in The Observer. In it, John Naughton suggests that it is all a matter of the professional journalists envying the amateur and he points out that much of the stuff written by the professional hack is not available on the Web. His moral?
The moral is: if you want to score with Google, be on the web. Otherwise, go whistle.
That seems fair!
|
Google fictions?
(by Tom Wilson, posted at 2:52 PM)
Here's a complicated story. Some time back Andrew Orlowski of The Register published a story about Google dropping Weblogs from its searches and, instead, using a special-purpose search engine. I commented on this at the time. A story in Monday's Guardian Online suggests that this is not the case. It all leaves one wondering what is reliable on the Web and in Weblogs. This message is, because I'm just pointing you to pages that exist :-)
|

This work is licensed under a
Creative Commons License.
This site managed with Conversant, © Copyright 2008 Macrobyte Resources
|