1. K-NN for image classification:
pip3 install -U scikit-learn
pip3 install imutils
pip install opencv-python
- Run command
python3 newKNN.py -d DATA_FOLDER
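The source of newKNN.py is not shown here, so the following is only a from-scratch sketch of the k-NN voting rule, using the standard library alone. The function name `knn_predict`, the toy feature vectors, and the labels are all hypothetical; the actual script presumably extracts features from the images in DATA_FOLDER and may rely on scikit-learn's classifier instead.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Pair each training vector's Euclidean distance to the query with its label.
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    # Keep the k closest neighbours and let them vote.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D vectors standing in for image features (made-up data).
train_features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
                  (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
train_labels = ["text", "text", "text", "photo", "photo", "photo"]

prediction = knn_predict(train_features, train_labels, (4.9, 5.2), k=3)
print(prediction)
```

An odd k is a common choice for two-class problems because it avoids tied votes.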
2. Save thumbnail pictures using pdf2image:
pip3 install pdf2image
pip3 install temp
tempfile is part of the Python standard library and does not need to be installed.
- Run command
python3 new_thumbnailPDF.py
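As a rough illustration of what a thumbnail step with pdf2image can look like, the sketch below renders only the first page of a PDF and saves a shrunken PNG. The helper names, the `.png` naming scheme, and the 256-pixel bound are assumptions, not taken from new_thumbnailPDF.py. pdf2image also needs the poppler utilities installed on the system.

```python
from pathlib import Path


def thumbnail_path(pdf_path, out_dir):
    """Build the output path, e.g. docs/a.pdf -> <out_dir>/a.png (naming is an assumption)."""
    return Path(out_dir) / (Path(pdf_path).stem + ".png")


def save_thumbnail(pdf_path, out_dir, max_size=(256, 256)):
    """Render page 1 of a PDF and save it as a small PNG."""
    # Imported here so the path helper above works without pdf2image installed.
    from pdf2image import convert_from_path

    # Render only the first page; convert_from_path returns a list of PIL images.
    pages = convert_from_path(str(pdf_path), first_page=1, last_page=1)
    image = pages[0]
    image.thumbnail(max_size)  # in-place resize that keeps the aspect ratio
    target = thumbnail_path(pdf_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    image.save(target)
    return target
```

A call such as `save_thumbnail("docs/report.pdf", "thumbs")` would then write `thumbs/report.png`, assuming the PDF exists and poppler is available.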
3. Randomly select a group of PDF files:
- Run command
python randomWhiteList.py Number
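A minimal sketch of random selection, assuming `Number` is the count of PDF files to pick from a folder. The function `pick_random_pdfs` and its `seed` parameter are hypothetical and only illustrate the idea; randomWhiteList.py may work differently.

```python
import random
import tempfile
from pathlib import Path


def pick_random_pdfs(folder, number, seed=None):
    """Return up to `number` distinct PDF paths chosen at random from folder."""
    pdfs = sorted(Path(folder).glob("*.pdf"))
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    # sample() draws without replacement, so no file is chosen twice.
    return rng.sample(pdfs, min(number, len(pdfs)))


# Demo on a throwaway directory with ten PDFs and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(10):
        (Path(tmp) / f"doc{i}.pdf").touch()
    (Path(tmp) / "notes.txt").touch()
    chosen_names = [p.name for p in pick_random_pdfs(tmp, 3, seed=0)]

print(chosen_names)
```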
4. Rename PDF files to Kind.Number.pdf:
- Run command
python Myrename.py StartIndex Kind
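The renaming scheme can be sketched as follows, assuming `StartIndex` is the first number to use and `Kind` is the label that prefixes every file name. The function `rename_pdfs` is hypothetical, and processing the files in sorted order is an assumption.

```python
import tempfile
from pathlib import Path


def rename_pdfs(folder, start_index, kind):
    """Rename every PDF in folder to <kind>.<number>.pdf, numbering from start_index."""
    pdfs = sorted(Path(folder).glob("*.pdf"))
    renamed = []
    for number, pdf in enumerate(pdfs, start=start_index):
        target = pdf.with_name(f"{kind}.{number}.pdf")
        # Refuse to overwrite a file that already carries the target name.
        if target.exists():
            raise FileExistsError(target)
        pdf.rename(target)
        renamed.append(target.name)
    return renamed


# Demo on a throwaway directory; the non-PDF file is left untouched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.pdf", "a.pdf", "c.pdf", "notes.txt"):
        (Path(tmp) / name).touch()
    renamed = rename_pdfs(tmp, start_index=5, kind="invoice")
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(renamed)
```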
5. Extract the /Root/Lang attribute and choose the language for Tesseract:
pip3 install PyPDF2
- Run command
python3 detectLang2.py
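The /Root/Lang entry of a PDF usually holds a language tag such as `en-US`, while Tesseract expects its own three-letter language names such as `eng`. The sketch below shows one way to translate between the two. The mapping table is deliberately partial, and the function name and the `eng` fallback are assumptions; detectLang2.py may choose differently.

```python
# Partial mapping from two-letter language codes to Tesseract language names.
TESSERACT_LANGS = {
    "en": "eng",
    "de": "deu",
    "fr": "fra",
    "es": "spa",
    "it": "ita",
    "ru": "rus",
    "he": "heb",
}


def tesseract_lang(pdf_lang, default="eng"):
    """Map a PDF language tag such as 'en-US' to a Tesseract language name."""
    if not pdf_lang:
        # The attribute is optional, so fall back when it is missing or empty.
        return default
    # Keep only the primary subtag: 'en-US' -> 'en', 'de_DE' -> 'de'.
    primary = pdf_lang.strip().replace("_", "-").split("-")[0].lower()
    return TESSERACT_LANGS.get(primary, default)


print(tesseract_lang("en-US"), tesseract_lang("de-DE"), tesseract_lang(None))
```

The tag itself would be read from the PDF catalog with PyPDF2 and passed to this function as a plain string.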
6. Detect whether an image is blurry:
- Run command
python detectBlur.py
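A common heuristic for blur detection is the variance of the Laplacian: sharp images have strong edges and therefore a high variance, while blurry ones do not. Whether detectBlur.py uses this measure is an assumption. The sketch below works on a grayscale image given as a plain list of rows, and the threshold of 100 is an arbitrary placeholder that would need tuning on real data.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels of a 2-D grid."""
    rows, cols = len(gray), len(gray[0])
    responses = []
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Discrete Laplacian: sum of the four neighbours minus 4x the centre.
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


def is_blurry(gray, threshold=100.0):
    """Flag the image as blurry when its edge response varies too little."""
    return laplacian_variance(gray) < threshold


# A checkerboard has an edge at every pixel; a flat grey image has none.
sharp = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
flat = [[128] * 8 for _ in range(8)]

print(is_blurry(sharp), is_blurry(flat))
```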
7. Elbow method to find the optimal k:
- Run command
python3 elbowMetod -d DATA_FOLDER
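The elbow method runs k-means for a range of k values and looks for the point where the inertia (the sum of squared distances to the nearest cluster centre) stops dropping sharply. The following is a from-scratch sketch on made-up 2-D points, not the author's script: the function name, the restart count, and the toy data are assumptions, and the real script presumably works on image features from DATA_FOLDER.

```python
import math
import random


def kmeans_inertia(points, k, iterations=20, restarts=5):
    """Run Lloyd's k-means a few times and return the lowest inertia found."""
    best = math.inf
    for seed in range(restarts):
        rng = random.Random(seed)
        centers = rng.sample(points, k)  # start from k of the data points
        for _ in range(iterations):
            # Assignment step: attach every point to its nearest centre.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
                clusters[nearest].append(p)
            # Update step: move each centre to its cluster mean
            # (an empty cluster keeps its previous centre).
            centers = [
                tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                if cluster else centers[i]
                for i, cluster in enumerate(clusters)
            ]
        inertia = sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)
        best = min(best, inertia)
    return best


# Three tight, well-separated blobs of nine points each (made-up data).
offsets = (-0.5, 0.0, 0.5)
points = [
    (cx + dx, cy + dy)
    for cx, cy in ((0, 0), (10, 0), (0, 10))
    for dx in offsets
    for dy in offsets
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 1))
```

Plotting these values against k and picking the bend in the curve gives the suggested number of clusters.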
8. KMC (k-means clustering) for clustering images:
pip3 install seaborn
- Run command
python KMC.py -d DATA_FOLDER