This is a simple task for someone who knows PHP & MySQL.
Objective:
1. Build an on-site web tool that scrapes school information from [login to view URL], collecting the following data:
SOURCE 1: CITY DATA EXAMPLE
(example of data needed to be captured)
city-url: [login to view URL]
preschools: 15
elementary: 13
middleschool: 4
highschool: 2
public: 8
private: 18
school district1: Manhattan Beach Unified School District
school district1-url:
school district2:
school district2-url:
school district3:
school district3-url:
city_name: Manhattan Beach
median_household_income: 130,443
median_home_price: 1,278,980
unemployment: 5%
violent_crime_index: 3
days_per_year_some_sun: 286
days_per_year_some_precipitation: 34
population: 37,745
SOURCE 2: SCHOOL DATA EXAMPLE
school-url: [login to view URL]
school_name: Mira Costa High School
Status: Public
preschool: no
elementary: no
middle: no
jr_high: no
highschool: yes
address: 701 South Peck Ave.
city: Manhattan Beach
state: CA
zip_code: 90266
county: Los Angeles County
phone: 310-318-7337
fax: 310-303-3814
school-website: [login to view URL]
student_population: 2,424
religion: n/a
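The sample values above mix formatted numbers ("130,443"), percentages ("5%"), yes/no flags and "n/a", so they will probably need cleaning before storage. The following is a minimal sketch of one way to do that in PHP. The function names `normalizeValue` and `normalizeRecord` are hypothetical, and the conversion rules are assumptions based only on the examples shown here, not on the real pages.

```php
<?php
/**
 * Convert one raw scraped string into a typed value.
 * Rules are assumptions inferred from the sample data only.
 *
 * @return int|float|bool|string|null
 */
function normalizeValue(string $raw)
{
    $value = trim($raw);
    $lower = strtolower($value);

    if ($value === '' || $lower === 'n/a') {
        return null;                                  // missing or not applicable
    }
    if ($lower === 'yes' || $lower === 'no') {
        return $lower === 'yes';                      // "yes" -> true, "no" -> false
    }
    if (preg_match('/^\d{1,3}(,\d{3})+$/', $value) === 1) {
        return (int) str_replace(',', '', $value);    // "130,443" -> 130443
    }
    if (preg_match('/^\d+$/', $value) === 1) {
        return (int) $value;                          // "286" -> 286
    }
    if (preg_match('/^(\d+(?:\.\d+)?)%$/', $value, $m) === 1) {
        return (float) $m[1];                         // "5%" -> 5.0
    }
    return $value;                                    // anything else stays text
}

/**
 * Normalize a whole record. Keys listed in $keepAsText are only trimmed,
 * so identifiers such as ZIP codes keep any leading zeros.
 */
function normalizeRecord(array $raw, array $keepAsText = ['zip_code', 'phone', 'fax']): array
{
    $clean = [];
    foreach ($raw as $key => $value) {
        $clean[$key] = in_array($key, $keepAsText, true)
            ? trim((string) $value)
            : normalizeValue((string) $value);
    }
    return $clean;
}
```

For example, `normalizeRecord(['zip_code' => '90266', 'student_population' => '2,424', 'highschool' => 'yes'])` keeps the ZIP code as text, turns the population into the integer 2424 and the flag into `true`.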
2. Output to 2 distinct files: a [login to view URL] file and a [login to view URL] file.
3. Store the data in MySQL (a later feature will output it to an HTML page).
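As a rough illustration of the storage step, the sketch below builds a parameterized INSERT statement from one scraped record so it can be run through PDO prepared statements. The function name `buildInsert`, the table name `cities`, and the rule that hyphens and spaces in field names become underscores are all assumptions for illustration. The actual schema is not specified here.

```php
<?php
/**
 * Build a parameterized INSERT for one scraped record.
 * Column names are derived from the array keys: lowercased, with any
 * character outside a-z, 0-9 and _ replaced by an underscore
 * (so "city-url" maps to a column named city_url; this is an assumption).
 *
 * @return array [string $sql, array $params]
 */
function buildInsert(string $table, array $record): array
{
    $table   = preg_replace('/[^a-z0-9_]/', '_', strtolower($table));
    $columns = [];
    $params  = [];

    foreach ($record as $key => $value) {
        $column    = preg_replace('/[^a-z0-9_]/', '_', strtolower((string) $key));
        $columns[] = "`$column`";
        // PDO binds execute() parameters as strings, so store booleans as 0/1.
        $params[":$column"] = is_bool($value) ? (int) $value : $value;
    }

    $sql = sprintf(
        'INSERT INTO `%s` (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        implode(', ', array_keys($params))
    );

    return [$sql, $params];
}

// Usage with PDO (DSN, $user and $pass are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=schools;charset=utf8mb4', $user, $pass,
//     [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// [$sql, $params] = buildInsert('cities', $cityRecord);
// $pdo->prepare($sql)->execute($params);
```

Binding values as parameters instead of concatenating them into the SQL string keeps scraped text (which may contain quotes) from breaking or injecting into the query.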
4. Determine whether the site blocks scraping attempts; if so, implement measures to avoid being banned:
1. random timing delays between requests;
2. an option to scrape through a proxy list that I provide (is this possible?);
3. any other suggestions to avoid being banned for scraping.
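On the proxy question: PHP's cURL extension does support routing a request through a proxy via the `CURLOPT_PROXY` option, so a provided proxy list should be usable. The sketch below combines that with random delays. The helper names (`randomDelayMicros`, `parseProxyList`, `fetchPage`), the one-proxy-per-line "host:port" list format, and the 2 to 8 second delay range are assumptions for illustration, not requirements from this posting.

```php
<?php
/** Pick a random pause, in microseconds, between $minSeconds and $maxSeconds. */
function randomDelayMicros(float $minSeconds, float $maxSeconds): int
{
    return random_int(
        (int) round($minSeconds * 1000000),
        (int) round($maxSeconds * 1000000)
    );
}

/** Parse a proxy list: one "host:port" per line, blank lines and # comments skipped. */
function parseProxyList(string $text): array
{
    $proxies = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line !== '' && $line[0] !== '#') {
            $proxies[] = $line;
        }
    }
    return $proxies;
}

/** Fetch one page, optionally through a proxy. Returns the body, or null on failure. */
function fetchPage(string $url, ?string $proxy = null): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => 30,
    ]);
    if ($proxy !== null) {
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Usage ($urls and proxies.txt are placeholders):
// $proxies = parseProxyList(file_get_contents('proxies.txt'));
// foreach ($urls as $url) {
//     $proxy = $proxies ? $proxies[array_rand($proxies)] : null;
//     $html  = fetchPage($url, $proxy);
//     usleep(randomDelayMicros(2.0, 8.0));   // wait 2 to 8 seconds between requests
// }
```

Picking a proxy at random per request and sleeping for a variable interval makes the traffic look less regular. Whether that is enough depends on how the target site detects scrapers.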
******
You may propose a desktop option IF using proxies or scraping the site from a tool installed on a website is not feasible.
Please