In progress

webpage categorization program

This assignment asks you to create a web page categorization program.

• The program reads 20 (or more) web pages. The URLs for some of these pages can be kept in a control file that is read when the program starts. The others should be pages linked from those. (Wikipedia is a recommended source.) For each page, the program maintains word frequencies along with any other related information that you choose.

• The user can enter any other URL, and the program reports which other known page is most closely related, using a similarity metric of your choosing.
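As a rough illustration of the two requirements above, the sketch below counts word frequencies in a page's text and compares two pages with cosine similarity. The class name `PageSimilarity`, the tokenization rule, and the choice of cosine similarity are assumptions made for this example; the assignment leaves the metric open, and fetching a page and stripping its HTML are assumed to happen elsewhere.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: word counts and cosine similarity between two texts. */
class PageSimilarity {

    /** Counts lowercase words; a "word" here is any run of letters or digits (an assumption). */
    static Map<String, Integer> wordFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Cosine similarity of two count vectors; returns 0.0 if either vector is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * (double) other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    /** Euclidean length of a count vector. */
    private static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int c : v.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }
}
```

To find the closest known page, one option is to compute `cosine` between the entered page's counts and each stored page's counts and keep the highest score.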

The implementation restrictions are:

• Create a cache, based on a custom hash table class that you implement, to keep track of pages that have not been modified since they were last accessed; store the cached pages in local files.

• Use library collections or your own data structures for all other data stores. Read through the Collections tutorial.

• Establish a similarity metric. This must be in part based on word-frequencies, but may include other attributes. If you follow the recommended approach of hash-based TF-IDF, create a hash table storing these.

• A GUI allows a user to indicate one entity, and displays one or more similar ones.

The presentation details are up to you. Use Swing, JavaFX, or Android components for the GUI.
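The custom hash table required for the cache could take many forms. The following is a minimal sketch using separate chaining with resizing. The class name `SimpleHashTable`, the initial capacity of 16, and the 3/4 load threshold are assumptions for illustration, and null keys are not handled.

```java
/** Hypothetical minimal hash table with separate chaining; null keys are not supported. */
class SimpleHashTable<K, V> {

    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V>[] buckets = newTable(16);
    private int size;

    @SuppressWarnings("unchecked")
    private static <K, V> Node<K, V>[] newTable(int capacity) {
        return (Node<K, V>[]) new Node[capacity];
    }

    /** Maps a key to a bucket index; the mask clears the sign bit of the hash code. */
    private static int indexFor(Object key, int capacity) {
        return (key.hashCode() & 0x7fffffff) % capacity;
    }

    /** Returns the value stored for the key, or null if the key is absent. */
    V get(K key) {
        for (Node<K, V> n = buckets[indexFor(key, buckets.length)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    /** Inserts a new entry or overwrites the value of an existing key. */
    void put(K key, V value) {
        int i = indexFor(key, buckets.length);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                n.value = value;
                return;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]);
        if (++size > buckets.length * 3 / 4) {
            resize();
        }
    }

    /** Doubles the bucket array and relinks every node into its new bucket. */
    private void resize() {
        Node<K, V>[] old = buckets;
        buckets = newTable(old.length * 2);
        for (Node<K, V> head : old) {
            for (Node<K, V> n = head; n != null; ) {
                Node<K, V> next = n.next;
                int i = indexFor(n.key, buckets.length);
                n.next = buckets[i];
                buckets[i] = n;
                n = next;
            }
        }
    }

    int size() {
        return size;
    }
}
```

For the page cache, one possible use is a `SimpleHashTable<String, Long>` that maps each URL to the last-modified time recorded when its local copy was saved, so the program can decide whether to re-download the page.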

This extends Assignment 1 using persistent data structures and additional similarity metrics. It requires two programs.


• For each of at least 100 URLs, create a persistent file-based B-Tree (or B+-Tree) containing frequencies and/or other information for hash-based key representations.

• Load each B-tree with frequencies (possibly along with other data) extending or changing those in assignment 1 if applicable.

• Implement and use a fixed size buffer cache to reduce IO.

• Pre-categorize pages into 5 to 10 clusters using k-means, k-medoids, or a similar clustering method.
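The clustering step in the list above could be sketched with plain k-means over dense vectors, for example hashed word-frequency vectors of equal length. The class name `KMeansSketch`, the naive seeding from the first k points, and the use of squared Euclidean distance are all assumptions for this example. A real solution would likely seed the centroids more carefully and must ensure that k does not exceed the number of points.

```java
import java.util.Arrays;

/** Hypothetical k-means sketch over dense vectors of equal length. */
class KMeansSketch {

    /** Returns the cluster index assigned to each point after at most maxIters rounds. */
    static int[] cluster(double[][] points, int k, int maxIters) {
        int dim = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone(); // naive seeding: the first k points
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIters; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int p = 0; p < points.length; p++) {
                int best = 0;
                double bestDist = squaredDistance(points[p], centroids[0]);
                for (int c = 1; c < k; c++) {
                    double d = squaredDistance(points[p], centroids[c]);
                    if (d < bestDist) {
                        bestDist = d;
                        best = c;
                    }
                }
                if (assignment[p] != best) {
                    assignment[p] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[assignment[p]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // leave an empty cluster's centroid where it was
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```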


Extend Assignment 1 to display a category (cluster) and most similar key from the above data structures.
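For the fixed-size buffer cache mentioned in the loader requirements above, one common approach is a least-recently-used (LRU) cache of disk blocks. The sketch below builds on `java.util.LinkedHashMap` in access order. The class name `BlockCache`, the `Long` block-number key, and the LRU policy itself are assumptions, since the assignment does not prescribe an eviction policy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical fixed-size LRU cache of disk blocks, keyed by block number. */
class BlockCache extends LinkedHashMap<Long, byte[]> {

    private final int capacity;

    BlockCache(int capacity) {
        // accessOrder = true: entries are ordered from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    /** Called by LinkedHashMap after each insertion; true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        return size() > capacity;
    }
}
```

A B-tree that modifies cached blocks would also need to write a dirty block back to its file before the block is evicted; this sketch leaves that out.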

Skills: Coding, Java


About the employer:
(0 reviews) Oswego, United States

Project ID: #21931070

1 freelancer is bidding an average of $20 for this project


Hi, I have the capabilities to do the project, and I'm glad to work with you, sir. I will deliver the project as soon as possible.

(0 reviews)