~ Link Sorter ~
Link Sorter is a Windows (.NET Framework 4) utility that sorts links by their semantic proximity to a given text sample. It supports 14 languages.
How it works
The user enters a list of links, provides a sample of the wanted text, selects a language, and sets two optional parameters: stemming and stop-word filtering. Stemming reduces multiple word forms to a single base form, for example "went", "goes" and "going" to "go". Stop words are frequently used words that carry little meaning, such as "any", "next" and "another". Both stemming and stop-word filtering are language-specific, so selecting the correct language is critical.
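The processing described above can be illustrated with a minimal Python sketch. It is not the code used by Link Sorter: the proximity measure of the utility is not documented here, so cosine similarity over word counts stands in for it as an assumption. The names `STOP_WORDS`, `STEM_TABLE`, `tokenize`, `cosine_similarity` and `rank_pages` are hypothetical, the two word tables are tiny English samples, and the page texts are assumed to be downloaded already.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny English resources for illustration only.
STOP_WORDS = {"any", "next", "another", "the", "a", "to", "is"}
STEM_TABLE = {"went": "go", "goes": "go", "going": "go"}


def tokenize(text, use_stemming=True, filter_stop_words=True):
    """Lowercase the text, split it into words and apply the two options."""
    words = re.findall(r"[a-z]+", text.lower())
    if filter_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    if use_stemming:
        # A lookup table replaces a real stemmer in this sketch.
        words = [STEM_TABLE.get(w, w) for w in words]
    return words


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity of two bags of words (an assumed proximity measure)."""
    count_a, count_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(count_a[w] * count_b[w] for w in count_a.keys() & count_b.keys())
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_pages(sample, pages, **options):
    """Return the URLs of `pages` (url -> page text), closest to `sample` first."""
    sample_tokens = tokenize(sample, **options)
    scored = [
        (cosine_similarity(sample_tokens, tokenize(text, **options)), url)
        for url, text in pages.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored]
```

With stemming switched on, "goes" in a page and "going" in the sample both become "go" and count as a match. With the wrong language selected, neither table would apply, which is why the language choice matters.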
For a quick test I added a self-test button. It fills in the links and sets the options, so the user only needs to start the processing.
The result will be saved into an HTML file and shown in the browser.
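As a rough illustration of this output step, the Python sketch below writes a ranked list of links to an HTML file and opens it in the default browser. The layout of the real result file is not described here, so the ordered list is an assumption, and `render_result` and `save_and_show` are hypothetical names.

```python
import html
import webbrowser
from pathlib import Path


def render_result(ranked_links):
    """Build a minimal HTML page listing the links in ranked order."""
    items = "\n".join(
        '<li><a href="{0}">{0}</a></li>'.format(html.escape(url, quote=True))
        for url in ranked_links
    )
    return "<html><body><ol>\n" + items + "\n</ol></body></html>"


def save_and_show(ranked_links, path="result.html"):
    """Write the page to disk and open it in the default browser."""
    out = Path(path).resolve()
    out.write_text(render_result(ranked_links), encoding="utf-8")
    webbrowser.open(out.as_uri())
    return out
```

Escaping each URL keeps characters such as "&" in query strings from breaking the generated markup.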
To make collecting links easier, I also provide a Firefox extension, LinkReader, so the user can gather links from a search engine such as Google or Bing. Yahoo does not show the links, so not every search engine works with my extension.
Formats and structures
The unzipped LinkSorter package contains only two files: a DLL (LinkSorter.DLL) and an executable (SearchUtility.exe). Processed links are saved to the file "result.html" in the same folder as the executable. Link lists can be saved to and read from a text file. The format is shown in the example below:
--valid links--
https://www.blogger.com/?tab=wj
http://www.linternaute.com/ville/paris/ville-75056
http://greater-paris-investment-agency.com/
http://www.maxicours.com/se/fiche/8/1/228481.html
http://tatoeba.org/eng/sentences/show/331233
http://wikitravel.org/fr/Paris
--invalid links--
http://www.larousse.fr/encyclopedie/ville/Paris/137068
https://fr.vikidia.org/wiki/Paris
https://fr.vikidia.org/wiki/Paris#Paris_capitale_du_royaume_cap.C3.A9tien
http://www.pourquois.com/histoire_geo/pourquoi-paris-est-capitale-france.html

The two buttons marked "=>" and "<=" move links between the two list boxes holding valid and invalid links. When links are pasted from the clipboard using the Firefox extension, they are presorted into valid and invalid. Those considered invalid are internal Google or Bing links with long generated IDs. They look like this:
http://webcache.googleusercontent.com/search?q=cache:Az75KX8_OYAJ:www.huffingtonpost.com

My program puts them into the invalid links category. The user is free to move them back, but I would not recommend using them, because Google may mistake the utility for a programmatic wrapper around its search engine and block the IP address.
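The text file format and the presorting can be sketched in Python as follows. This is an illustration under stated assumptions, not the code of Link Sorter or LinkReader: the file is assumed to hold one link per line under the two section headers, and the check in `looks_internal` is a hypothetical heuristic that only recognizes the Google cache host from the example above. All function names and `SUSPECT_HOSTS` are made up for this sketch.

```python
from urllib.parse import urlparse

VALID_HEADER = "--valid links--"
INVALID_HEADER = "--invalid links--"

# Hypothetical heuristic: hosts that serve search-engine internal pages.
SUSPECT_HOSTS = {"webcache.googleusercontent.com"}


def parse_link_file(text):
    """Split the saved text into the valid and invalid link lists."""
    sections = {"valid": [], "invalid": []}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == VALID_HEADER:
            current = "valid"
        elif line == INVALID_HEADER:
            current = "invalid"
        elif current is not None:
            # Lines before the first header are ignored.
            sections[current].append(line)
    return sections


def looks_internal(url):
    """Return True for links that point to a suspect host."""
    return urlparse(url).hostname in SUSPECT_HOSTS


def presort(urls):
    """Split pasted links into (valid, invalid) lists, keeping their order."""
    valid = [u for u in urls if not looks_internal(u)]
    invalid = [u for u in urls if looks_internal(u)]
    return valid, invalid
```

A presort of this kind is only a first guess. As with the "=>" and "<=" buttons, the user keeps the final say over which list a link belongs to.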
Customization
This utility is free, but customization can be done for a fee. Those who are interested can contact the developer. The contact information is on the main site, "SemanticQuery.com".