
299916 Cloaking and page generation


In Progress
Posted over 20 years ago


Paid on delivery
I need a search-engine cloaking script, preferably in Perl (or possibly PHP), with a web interface, on a Linux server running Apache and MySQL. It should generate pages for search engines to crawl while redirecting regular users to specified URLs. Here are the details of how I envisage the interface.

Firstly, there will be one screen for page generation, as follows.

1. Domain name selection menu
The domain names on my server each have a sub-directory for HTML pages, located in the same directory. I envisage that these domain names can be read by examining the directory structure. I have full admin rights to the servers.

2. Search term input box
This is a list of search terms for which HTML pages will be generated. Each search term will correspond to a single generated page. The name of the generated page will be the search term (which may actually consist of more than one word) with underscores separating the component words of the search term.

3. Text database selection menu
There will be a number of large text blocks, each consisting of several thousand words or more, from which pages will be generated in conjunction with search terms and templates. These texts can be read by assuming some form of file name extension in a data directory (e.g. .bk).

4. HTML template selection menu
A template is just a basic HTML page with formatting, plus several embedded pre-defined tags with varying parameters. Templates can be recognized by their file name extension in a specified directory. An example of each required tag is below.

a) [KEYPARA 500, 22-69, 6] : specifies that at this position in the generated HTML, a paragraph of 500 words should be drawn from a random contiguous section of the selected text database (specified in 3), with search terms taken from between the 22% and 69% positions (rounded sensibly) of the search term list (specified in 2) inserted between any two words within the paragraph with a probability (i.e. approximate density) of 6%.

b) [KEYWORDZ 33-49, 4] : specifies that at this position in the generated HTML, 4 search terms randomly selected from between the 33% and 49% positions of the search term list should be inserted.

c) [KEYLINK 4] : specifies that this page is part of a group of 4 pages which link to the 4 pages corresponding to the 4 search terms (where they exist) immediately above and below the (contiguous set of) search terms corresponding to this set of pages in the search term list specified in 2.

d) [KEYLINKDOWN 4] : specifies that this page is part of a group of 4 pages which link to the 4 pages corresponding to the 4 search terms (where they exist) immediately below the (contiguous set of) search terms corresponding to this set of pages in the search term list specified in 2.

There will be a few additional tags required, which would be simple extensions of the above.

5. Redirection URL (with search term parameter)
Every user will enter the site via a search engine results page which lists a particular generated page. The user will never see the generated page but will instead be redirected immediately to this redirection URL. The redirection URL may contain the tag [KEYWORD], which is to be replaced by the search term corresponding to the name of the file via which the user entered, e.g. [login to view URL][KEYWORD]. NOTE: The redirection and spider detection MUST be done using .htaccess, which will hence need to be modified by the script.

There will be another screen for specifying the IP addresses of the search engine spiders. Cloaking is performed by recognizing the IP address of the spider and taking a different action than for a regular user. It is critical that these IP addresses are complete and correctly detected in real time. This screen should look something like this:

6. A search engine selection menu
To specify which search engine we are working on. This is mainly for organizational neatness; the cloaking will be the same for all engines.

7. An IP address input box
IP addresses can be specified here. This input box should also allow the specification of IP ranges without enumerating every single IP address.

I will probably need a third screen which specifies global parameters such as the location of the domain directory, the location of the HTML directories relative to the domain directory, and the location of the template and text database files. Maybe also a "load text" function for loading the text files into a MySQL database for faster access? I guess the need for this depends on the access times needed for page generation.

So the basic modus operandi is: when a user clicks on a page of one of the specified domain names from a search engine results page, he will automatically be redirected to the redirection URL specified in 5. When a search engine spider comes to crawl a page on the domain, it is allowed to do so without being redirected.

If you need more information to make a sensible bid, you are welcome to message me with your questions. I will not select a bidder until I am confident that the bidder understands the scope and requirements of the project. Of course I am looking for the lowest price here, but a bidder who demonstrates that he/she understands the requirements is worth more to me than one who makes a lower bid and doesn't. Suggestions about how to do things better are always welcome too.
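Since the posting insists that redirection and spider detection be done via .htaccess (item 5), the script might emit mod_rewrite rules along these lines. This is a rough sketch only: the IP prefixes and the redirection URL are placeholders, and the real spider addresses and ranges would be maintained through the screen described in item 7.

```apache
RewriteEngine On

# Known spider addresses (placeholders) are exempt from the redirect,
# so search engines crawl the generated pages normally.
RewriteCond %{REMOTE_ADDR} !^66\.249\.
RewriteCond %{REMOTE_ADDR} !^207\.46\.

# Everyone else requesting a generated page is redirected, with the
# search term recovered from the file name passed as a parameter.
RewriteRule ^([A-Za-z0-9_]+)\.html$ http://www.example.com/landing?kw=$1 [R=302,L]
```

Each time the spider IP list or the redirection URL changes in the interface, the script would regenerate this block in the affected domain's .htaccess file.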
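As one illustration of how the [KEYPARA] tag expansion and the underscore file naming described above might work, here is a short sketch. It is in Python rather than the requested Perl, purely to show the logic; all function and variable names are my own, not part of the specification.

```python
import random

def page_filename(term):
    """Search term -> generated page name, words joined by underscores (item 2)."""
    return "_".join(term.split()) + ".html"

def expand_keypara(text_words, search_terms, para_len, lo_pct, hi_pct, density_pct):
    """Sketch of [KEYPARA para_len, lo_pct-hi_pct, density_pct] (item 4a).

    Takes a random contiguous run of para_len words from the text database,
    then inserts terms drawn from the lo_pct%..hi_pct% slice of the search
    term list between words with probability density_pct%.
    """
    start = random.randint(0, max(0, len(text_words) - para_len))
    paragraph = text_words[start:start + para_len]

    lo = round(len(search_terms) * lo_pct / 100)
    hi = round(len(search_terms) * hi_pct / 100)
    pool = search_terms[lo:hi] or search_terms  # fall back if the slice is empty

    out = []
    for word in paragraph:
        out.append(word)
        if random.random() < density_pct / 100:
            out.append(random.choice(pool))
    return " ".join(out)
```

A real implementation would also need to honor the "rounded sensibly" wording for the percentage positions and keep the inserted terms' word counts in mind if the 500-word target is meant to be exact.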
Project ID: 2046002

About this project

Remote project
Active 12 years ago


About the client

5.0
4
Member since October 31, 2003

Client verified
Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)