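| Graduate student: | 王仕榮 Shih-Yong Wang |
|---|---|
| Thesis title: | 具檔案敘述相關語查詢之智慧型檔案搜尋系統 SmartArchie: An Intelligent File Search System |
| Advisor: | 曾黎明 Li-Ming Tseng |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Academic year of graduation: | 88 |
| Language: | Chinese |
| Number of pages: | 58 |
| Keywords (Chinese): | 檔案敘述 (file description), 檔案下載 (file download), 全球資訊網 (World Wide Web), 代理伺服器 (proxy server), 檔案傳輸 (file transfer), 資源再利用 (resource reuse) |
| Keywords (English): | description, file retrieval, Archie, World Wide Web, proxy, FTP, resource reuse |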
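As is well known, file downloads consume a large share of network bandwidth, yet it is not easy to find out where a desired piece of software is hosted.
Well-known or foreign sites have therefore become the destination of choice for many users, but always connecting to outside sites does nothing to conserve precious network resources.
Archie is a common service on the Internet that lets users search for files by filename string. However, Archie has not kept pace with the times:
more and more software can now be found only on WWW sites, while Archie can search only files stored on FTP sites. General WWW search engines, for their part, are also unsatisfactory at file search.
In this thesis, we implement an intelligent file search system that supports queries on terms related to file descriptions, called SmartArchie.
Users can search for the files they want by filename string or by terms related to the file description, whether the file resides on an FTP site
or a WWW site on the Internet. Moreover, the system cooperates with a proxy server, so SmartArchie can effectively raise the rate at which downloaded files are reused
and the overall hit rate. In addition, the system is fully Chinese-compatible.
Our long-term observation shows that more than 20% of users already search for files by terms related to the file description.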
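
The two query modes described above, search by filename string and search by description terms across FTP and WWW locations, can be illustrated with a minimal Python sketch. It is not SmartArchie's actual index or matching algorithm, which this record does not describe. The `FileEntry` type, the `search` function, the catalog contents, and the example.org/example.com URLs are all hypothetical, and plain substring matching is an assumption chosen only because it also works for Chinese text without word segmentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical catalog record; not SmartArchie's real schema."""
    url: str          # FTP or HTTP location of the file
    description: str  # free text describing the file

    @property
    def filename(self) -> str:
        # Last path component of the URL
        return self.url.rsplit("/", 1)[-1]


def search(catalog, query):
    """Return entries whose filename or description contains the query."""
    needle = query.casefold()
    return [
        entry for entry in catalog
        if needle in entry.filename.casefold()
        or needle in entry.description.casefold()
    ]


# Made-up catalog mixing one FTP entry and one Web entry.
CATALOG = [
    FileEntry("ftp://ftp.example.org/pub/net/wget-1.5.3.tar.gz",
              "command-line tool for downloading files"),
    FileEntry("http://www.example.com/download/viewer.zip",
              "image viewer 圖片瀏覽程式"),
]

by_name = search(CATALOG, "wget")          # filename query
by_description = search(CATALOG, "圖片")    # Chinese description query
```

In this sketch a single query string is tried against both fields, so the user does not have to say whether a term is a filename or a description word.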
It is widely believed that file retrieval accounts for a large share of network traffic. However, finding the desired software can be very
difficult, so people tend to fetch it from well-known or foreign sites.
As a result, bandwidth is rarely saved and resources are rarely reused.
The Archie service addresses the problem of locating files on the Internet by their attributes, usually the filename. But Archie has lost sight of the fact that more and more software packages are stored only on Web sites, since it can deal only with archives on FTP sites. In addition, most WWW search engines do not perform well at filename search or at locating
a file by descriptive words.
In this thesis, we present an intelligent file search system, called SmartArchie, for Internet archive discovery. It allows users to search for a file not only by filename but also by a description relevant to the file, and to locate the requested file whether it resides on a public FTP host or a Web site. By cooperating with a proxy caching server, SmartArchie increases the archive reuse ratio as well as the request and byte hit rates. Finally, indexing and searching in Chinese are also supported.
Our results show that about 20% of users search for files by description.
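
The request and byte hit rates mentioned above are standard proxy-cache metrics: the fraction of requests served from the cache, and the fraction of bytes served from the cache. The following Python sketch shows how they differ. It is not drawn from the thesis's measurements: the `hit_rates` function and the `LOG` data are hypothetical and serve only to make the arithmetic concrete.

```python
def hit_rates(requests):
    """Compute (request hit rate, byte hit rate).

    requests: iterable of (size_in_bytes, served_from_cache) pairs.
    """
    total_requests = hits = total_bytes = hit_bytes = 0
    for size, from_cache in requests:
        total_requests += 1
        total_bytes += size
        if from_cache:
            hits += 1
            hit_bytes += size
    request_hit_rate = hits / total_requests if total_requests else 0.0
    byte_hit_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return request_hit_rate, byte_hit_rate


# Made-up log: three downloads, only the large file is served from the cache.
LOG = [(1_000, False), (8_000, True), (1_000, False)]
request_rate, byte_rate = hit_rates(LOG)
```

With this made-up log, one of three requests is a hit, while 8,000 of 10,000 bytes come from the cache. The two rates can therefore diverge sharply when large files are the ones being reused, which is why both are reported for file downloads.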