WordPress Plugin: Sitemap Scraper and Visual Page/Category Creation

In progress · Posted Oct 2, 2013 · Paid on delivery

Plugin Overview

This plugin is meant to help someone build a brand-new website STRUCTURE in WordPress. That is, it analyzes the top 10 Google results for a keyword, extracts the page, category, and post titles from each site's sitemap (or crawls the site when there is no sitemap), and then lets the user visually pick which titles to use for home pages, category titles, or post titles. I broke the workflow down into 2 stages.

Stage 1

---------

A) Allow the user to enter a keyword.

B) Scrape the top 10 Google results for that keyword and get the domain name of each.

C) Put those top 10 domains into an array [1-10].

D) Now go through each domain and check whether it has a sitemap, i.e. [url removed, login to view]

E) If it does, extract the category/page/post names from the sitemap.

E) If it does not, use Scrapy (a Python framework) or a comparable PHP crawling class to crawl the website to 'X' levels of depth and find all the internal links.

i. For example, if the user enters 2 as the depth level, the plugin will crawl 2 levels deep to identify the internal link structure. Maybe use the following routines (or variations, obviously)?

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

F) The plugin will then score each page title, category title, and post title. Every title from the website ranked first gets 100 points, since Google has ranked that site as the most important. Each site below it starts from a sliding value based on its position: the 2nd website starts at 90, the 3rd at 80, and so forth. Any title from a lower-ranked site that MATCHES a title on the 1st-position website gets a score of 100.
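
The scoring rule in step F could be sketched roughly as follows. This is only an illustration in Python, not the plugin's actual code: the function names are hypothetical, and it assumes that a "match" means the same title after lowercasing and collapsing whitespace, and that a title found on several sites keeps its highest score.

```python
def base_score(rank: int) -> int:
    """Starting score for a site at a given Google rank: 1 -> 100, 2 -> 90, ... 10 -> 10."""
    return max(0, 100 - 10 * (rank - 1))


def normalize(title: str) -> str:
    """Assumed matching rule: case-insensitive, whitespace-insensitive comparison."""
    return " ".join(title.lower().split())


def score_titles(sites: list[list[str]]) -> dict[str, int]:
    """sites[0] holds the titles of the rank-1 site, sites[1] the rank-2 site, and so on.

    Returns the best score found for each normalized title.
    """
    if not sites:
        return {}
    top_titles = {normalize(t) for t in sites[0]}
    scores: dict[str, int] = {}
    for rank, titles in enumerate(sites, start=1):
        for title in titles:
            key = normalize(title)
            # A title that also appears on the rank-1 site is promoted to 100.
            score = 100 if key in top_titles else base_score(rank)
            scores[key] = max(scores.get(key, 0), score)
    return scores


example = score_titles([
    ["About Us", "Dog Training"],      # rank 1
    ["dog training", "Puppy Food"],    # rank 2
    ["Leashes"],                       # rank 3
])
print(sorted(example.items(), key=lambda item: -item[1]))
```

Fuzzy matching (for example "Dog Training Tips" against "Dog Training") is left out here; the brief does not say how close a match has to be.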

Please see referenced document
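
For steps D and E of Stage 1, a minimal sketch of both branches might look like the code below. It is written in Python for brevity and makes several assumptions: a standard sitemap lists URLs rather than titles, so titles are guessed from the URL slug; `fetch` is a hypothetical callable supplied by the caller that returns a page's HTML; and all function names are illustrative.

```python
import xml.etree.ElementTree as ET
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


def sitemap_urls(xml_text):
    """Return every <loc> URL in a sitemap document, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter()
            if el.tag.rsplit("}", 1)[-1] == "loc" and el.text and el.text.strip()]


def title_from_url(url):
    """Guess a readable title from the last path segment (the slug)."""
    slug = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    return slug.replace("-", " ").replace("_", " ").title()


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl_internal_links(start_url, fetch, max_depth):
    """Breadth-first crawl that follows internal links up to max_depth hops.

    fetch(url) must return the page's HTML as a string (hypothetical helper).
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are recorded but not expanded
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            link = urljoin(url, href).split("#", 1)[0]  # resolve and drop fragments
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return sorted(seen)
```

Injecting `fetch` keeps the crawl logic testable without network access; a real plugin would also need rate limiting and robots.txt handling, which this sketch omits.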

Stage 2

-------------

This is the trickier part; the above is mostly scraping and parsing with regex or a DOM parser. Stage 2 presents those titles and categories in a visual box list so that the user can manually drag and drop them to create a site structure. There is an abandoned plugin in the WordPress plugin directory called Visual Site Manager. It doesn't fully work right now because the devs haven't updated it.

[url removed, login to view]

Here's a good write-up of the plugin:

[url removed, login to view]

You can test-install it and see how it works, but again, not all the functionality is there right now.

What I like about it is its use of the JIT Spacetree visualization.

[url removed, login to view]

Obviously, if it's a brand-new site, all the main canvas would have is the top box representing the root of the site. By dragging and dropping each box onto the tree that JIT lets you build, one can take a predefined set of titles, scraped from other known authority sites and ranked in order of importance, and build a site structure from them.
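
As a rough illustration of the data behind that canvas, the sketch below builds a nested tree in the `id` / `name` / `data` / `children` shape that JIT tree visualizations load as JSON. The helper names, the node IDs, and the idea of storing the score under `data` are assumptions made for this example; a drop onto the canvas is modeled as attaching a node under a chosen parent.

```python
import json


def make_node(node_id, name, score=None):
    """Create one tree node; the score (if any) is kept in the free-form data field."""
    data = {} if score is None else {"score": score}
    return {"id": node_id, "name": name, "data": data, "children": []}


def find_node(tree, node_id):
    """Depth-first search for a node by id; returns None when it is absent."""
    if tree["id"] == node_id:
        return tree
    for child in tree["children"]:
        found = find_node(child, node_id)
        if found is not None:
            return found
    return None


def attach(tree, parent_id, node):
    """Model a drag-and-drop: place node under the node with id parent_id."""
    parent = find_node(tree, parent_id)
    if parent is None:
        raise KeyError(parent_id)
    parent["children"].append(node)
    return tree


# A brand-new site starts with only the root box.
site = make_node("root", "Home")
attach(site, "root", make_node("cat-1", "Dog Training", score=100))
attach(site, "cat-1", make_node("post-1", "Puppy Food", score=90))
print(json.dumps(site, indent=2))
```

The resulting JSON could be handed to the browser for rendering, and the same structure could later drive the creation of categories, pages, and posts.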

Now, if someone has an existing site, the main JIT canvas will show the existing site structure and allow them to change titles there (click on a node to bring up an edit-text screen) or just drag and drop new categories/pages/posts into the site structure.

Please have WordPress, JIT, and scraping experience.
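
For the existing-site case, one possible approach is to turn a flat list of pages into the same nested tree and then rename nodes in place. The sketch below assumes each row carries the `ID`, `post_parent`, and `post_title` fields that WordPress stores for posts and pages, with `post_parent` equal to 0 for top-level items; the function names and the `Site root` label are hypothetical, and the input is assumed to contain no parent cycles.

```python
def build_tree(rows, site_name="Site root"):
    """Nest flat rows under their parents; rows with an unknown parent go under the root."""
    root = {"id": "root", "name": site_name, "data": {}, "children": []}
    nodes = {
        row["ID"]: {"id": str(row["ID"]), "name": row["post_title"], "data": {}, "children": []}
        for row in rows
    }
    for row in rows:
        parent = nodes.get(row["post_parent"], root)
        parent["children"].append(nodes[row["ID"]])
    return root


def rename_node(tree, node_id, new_name):
    """Change a node's title in place; returns False when the id is not found."""
    stack = [tree]
    while stack:
        node = stack.pop()
        if node["id"] == node_id:
            node["name"] = new_name
            return True
        stack.extend(node["children"])
    return False


rows = [
    {"ID": 2, "post_parent": 0, "post_title": "About"},
    {"ID": 5, "post_parent": 2, "post_title": "Team"},
    {"ID": 7, "post_parent": 0, "post_title": "Blog"},
]
tree = build_tree(rows)
rename_node(tree, "5", "Our Team")
```

Writing the renamed titles and any newly dropped nodes back to WordPress is a separate step that this sketch does not cover.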

MySQL PHP Web Scraping WordPress

Project ID: #4987553

About the project

2 proposals · Remote project · Active Oct 3, 2013

2 freelancers are bidding on this job, with an average bid of $165/hour

phpmysqlrocks

Your requirement is very much clear to us so we are ready to start immediately. Thanks Chirag our past job for data mining and scraping https://www.freelancer.com/projects/Data-Mining/Data-Mining-from-web-site.ht ...

$187 USD in 3 days
(35 reviews)
5.4

ankasky

sir i am a software engineer and i am interested in this task i have 3 years experience in this field i did too many projects and know recently start freelancing to prove my skills and expertise if you are interested t ...

$185 USD in 15 days
(18 reviews)
3.9

WemoTechnologies

Respected Customer G'Day,,This is Hussein in charge of sales/Marketing. our strong technical experts will explain you every function and beat of features. After clarifications if you feel comfortable and satisfied t ...

$233 USD in 5 days
(1 review)
1.4