Web-Crawler

Website crawlers built with Selenium

The two crawlers were used to scrape real-estate information.

The code is intended as a web-scraping reference only. The keys to deploying a successful crawler are:

Reduce the number of unnecessary calls.
Rotate proxies.
Rotate user agents (less important).
Do not call the website too fast.
Leave some time for the website to render, so that all elements of the target page are available.
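
The points above can be sketched roughly as follows. This is a minimal illustration, not the code from the crawlers in this repository: the proxy addresses, user-agent strings, delay range, and all function names (`polite_delay`, `next_identity`, `should_fetch`, `build_driver`, `wait_for_element`) are hypothetical placeholders. Selenium is imported inside the functions that need it, so the pure-Python helpers can be tried without a browser installed.

```python
import itertools
import random
import time

# Hypothetical placeholder values; substitute your own proxies and user agents.
PROXIES = ["203.0.113.10:8080", "203.0.113.11:8080"]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleAgent/1.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleAgent/2.0",
]

proxy_cycle = itertools.cycle(PROXIES)  # round-robin proxy rotation
visited = set()                         # URLs already fetched in this run


def polite_delay(min_s=2.0, max_s=5.0, sleep=time.sleep):
    """Pause for a random interval so the site is not called too fast."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def next_identity():
    """Return the next proxy in the rotation and a random user agent."""
    return next(proxy_cycle), random.choice(USER_AGENTS)


def should_fetch(url):
    """Skip URLs that were already fetched, to avoid unnecessary calls."""
    if url in visited:
        return False
    visited.add(url)
    return True


def build_driver(proxy, user_agent):
    """Start Chrome through the given proxy with the given user agent."""
    from selenium import webdriver  # imported lazily; requires selenium

    options = webdriver.ChromeOptions()
    options.add_argument(f"--proxy-server=http://{proxy}")
    options.add_argument(f"--user-agent={user_agent}")
    return webdriver.Chrome(options=options)


def wait_for_element(driver, css_selector, timeout=15):
    """Block until the element is present, giving the page time to render."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
    )
```

A crawl loop under these assumptions would call `should_fetch(url)` first, then `build_driver(*next_identity())`, `driver.get(url)`, `wait_for_element(driver, "<your selector>")`, and finally `polite_delay()` before moving on to the next URL.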

File explanation

PostCode Crawler and the Cash Rate crawler are the typical easy crawlers. The idea is basic: just locate the element by its CSS selector.

The cash rate is published by the RBA, and it is an important feature for our machine learning algorithm, which predicts the real-estate market.

Crawler 1 (AC_Crawler_1.py) is a little more complicated.

Crawler 2 (AC_Crawler_2.py) is a more advanced crawler with proxy rotation and proxy pre-checking.
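
Proxy pre-checking, as mentioned for Crawler 2, could look something like the sketch below. It is an assumption about the general technique, not a copy of what AC_Crawler_2.py does: the function names, the test URL, and the timeout are hypothetical, and it uses only the standard library rather than Selenium. Each candidate proxy is tried against a test page once, and only the ones that respond are kept for rotation.

```python
import http.client
import urllib.request


def proxy_is_alive(proxy, test_url="https://example.com", timeout=5.0):
    """Return True if a request through `proxy` ("host:port") succeeds.

    The test URL and timeout are placeholder assumptions.
    """
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers connection errors, timeouts, and HTTP errors from the proxy.
        return False


def filter_working_proxies(candidates, check=proxy_is_alive):
    """Keep only the proxies that pass the check, preserving their order."""
    return [proxy for proxy in candidates if check(proxy)]
```

Running `filter_working_proxies(my_proxy_list)` once before the crawl starts means dead proxies are dropped up front instead of failing in the middle of a page load. The `check` parameter is there so the filtering logic can be exercised with a stand-in function that does not touch the network.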

Files in the repository:

.gitignore
AC_Crawler_1.py
AC_Crawler_2.py
Cash Rate.py
PostCode Crawler.py
README.md
