Phase5

This is the fifth phase of our project. In this phase we aim to build a final classifier for PDF files. The final classifier is based on the three previous machines in our project: image, text, and features. The final machine works as follows:

Install xgboost: sudo pip3 install xgboost
Extract all the data needed for the three base machines (image, text, features). This is done using classes imported into as.py.
Create base vectors for every sample (image, text, features).
Run every base machine on the samples and return each machine's classification of every sample.
Create a vector for the boost algorithm from the base machines' classifications of every sample.
Run the boost algorithm with RF on the sample boost vectors.
Return the boost algorithm's accuracy.
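
The last three steps (base classifications, boost vector, accuracy) can be sketched roughly as below. This is only an illustration, not the code in as.py: the function names, the 0/1 labels, and the sample data are all made up, and a simple majority vote stands in for the real boost algorithm so that the sketch runs without xgboost installed.

```python
from collections import Counter

# Assumption: 1 = malicious, 0 = benign. The real label encoding may differ.

def build_boost_vectors(image_preds, text_preds, feature_preds):
    """Combine the three base machines' predictions into one vector per sample."""
    if not (len(image_preds) == len(text_preds) == len(feature_preds)):
        raise ValueError("every base machine must classify every sample")
    return [list(row) for row in zip(image_preds, text_preds, feature_preds)]

def majority_vote(vector):
    """Stand-in for the boost model: return the most common base prediction."""
    return Counter(vector).most_common(1)[0][0]

def accuracy(predictions, labels):
    """Fraction of samples whose prediction matches the true label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Made-up classifications of four samples by each base machine.
image_preds = [1, 0, 1, 1]
text_preds = [1, 0, 0, 1]
feature_preds = [0, 0, 1, 1]
labels = [1, 0, 1, 0]

boost_vectors = build_boost_vectors(image_preds, text_preds, feature_preds)
final_preds = [majority_vote(v) for v in boost_vectors]
print(boost_vectors)
print(accuracy(final_preds, labels))
```

In the real pipeline the boost vectors would be passed to a trained boost model instead of majority_vote; the vector layout and the accuracy computation stay the same.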

Hello darkness my old friend

You need to install python2, along with a large number of libraries.

The project runs on python3 but calls python2 in several places.

You also need python3. Set things up so that the python command runs python2 and python3 runs python3 as usual. If you cannot do that, edit the classes (createDATA, etc.) and manually change the 'python' command to 'python2'.
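
One way to avoid depending on what the python command points to is to pass the interpreter name explicitly when calling out from python3. The sketch below is an assumption about how such a call could look, not the code in the classes; build_py2_command, run_with_python2, and the script name extract.py are hypothetical.

```python
import shutil
import subprocess

def build_py2_command(script, *args, interpreter="python2"):
    """Build the argument list for running a script under an explicit interpreter."""
    return [interpreter, script, *args]

def run_with_python2(script, *args, interpreter="python2"):
    """Run a script with python2 and return the completed process.

    Raises FileNotFoundError if the interpreter is not on PATH.
    """
    if shutil.which(interpreter) is None:
        raise FileNotFoundError(f"{interpreter!r} not found on PATH")
    command = build_py2_command(script, *args, interpreter=interpreter)
    return subprocess.run(command, check=True, capture_output=True, text=True)

# Hypothetical script and argument, shown only to illustrate the command shape.
print(build_py2_command("extract.py", "sample.pdf"))
```

Calling run_with_python2("extract.py", "sample.pdf") would then fail early with a clear error when python2 is missing, instead of silently running the script under the wrong interpreter.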

python2 -m pip install opencv-python==4.2.0.32

Node.js and npm versions (command and its output):

nodejs -v: v8.10.0

npm -v: 3.5.2

In jast, in the js folder, in is_js.py: change nodejs to node if you used nvm to install nodejs.
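
Instead of editing the command by hand, the binary name could be detected at runtime. This is a sketch of that idea and not what is_js.py currently does; find_node_binary is a hypothetical helper.

```python
import shutil

def find_node_binary(candidates=("nodejs", "node"), which=shutil.which):
    """Return the first candidate name found on PATH, or None if none exists.

    The `which` parameter is injectable so the lookup can be tested
    without Node.js installed.
    """
    for name in candidates:
        if which(name) is not None:
            return name
    return None

# Prints "nodejs", "node", or None depending on what is installed.
print(find_node_binary())
```

Note that shutil.which only sees binaries on the PATH of the current process, so an nvm install is found only if the shell that launched python3 had nvm loaded.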

pip2 install pdfminer==20140328

pip3 install pdfminer.six==20181108

In ExtractJS.txt, replace:

extract js > /home/tzar/Desktop/Final_Project/phase5/JSfromPDF.txt

with the location of JSfromPDF.txt on your machine, for example:

extract js > /newdrive/home/tzar/Desktop/Final_Project/shir_test/JSfromPDF.txt
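
The path substitution in ExtractJS.txt could also be scripted. The sketch below assumes the file contains a line starting with "extract js > " and leaves every other line untouched; retarget_extract_js is a hypothetical helper, and the rest of the file's format is not known here.

```python
PREFIX = "extract js > "

def retarget_extract_js(script_text, new_path):
    """Point every 'extract js > ...' line at new_path, keeping other lines as-is."""
    lines = []
    for line in script_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(PREFIX):
            indent = line[: len(line) - len(stripped)]
            line = f"{indent}{PREFIX}{new_path}"
        lines.append(line)
    trailing_newline = "\n" if script_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline

# Made-up file content; only the 'extract js' line format comes from the README.
example = "some other line\nextract js > /old/location/JSfromPDF.txt\n"
print(retarget_extract_js(example, "/new/location/JSfromPDF.txt"))
```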

Reference for saving a trained classifier to disk in scikit-learn: https://stackoverflow.com/questions/10592605/save-classifier-to-disk-in-scikit-learn
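
A minimal sketch of saving and reloading a model with Python's standard pickle module follows. A plain dict stands in for the trained classifier so the example runs without scikit-learn; save_model, load_model, and the file name boost_model.pkl are hypothetical, and the same two functions should work for any picklable object.

```python
import os
import pickle
import tempfile

def save_model(model, path):
    """Serialize a picklable model object to disk."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)

def load_model(path):
    """Load a model previously written by save_model."""
    with open(path, "rb") as fh:
        return pickle.load(fh)

# Stand-in for a trained classifier (assumption: the real object is picklable).
stand_in_model = {"name": "boost", "weights": [0.2, 0.5, 0.3]}

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "boost_model.pkl")
    save_model(stand_in_model, model_path)
    restored = load_model(model_path)

print(restored == stand_in_model)
```

Only load pickle files you created yourself, since unpickling can execute arbitrary code. A pickled model should also be loaded with the same library versions it was saved with.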
