A program using Goodreads data to suggest books to read
Books give insight into a person unlike any other form of information. I created this project because I have never found a great way to integrate knowledge from books over the long term. I have the same problem with audiobooks and podcasts: I consider them good sources of information, but listening is like drinking a fruit smoothie rather than eating the whole fruit. I find that the mind digests information better when it has to process every word visually rather than by ear. This project will find more books that I am likely to enjoy, including rare ones. It also gives me a project I can update throughout my life, it will represent my digital library, and it may turn into the knowledge repository I use to integrate what I learn from reading.
- Download data
- Create a search, make a list of books
- Create the recommendations
- Filter the recommendations
- Go to https://sites.google.com/eng.ucsd.edu/ucsdbookgraph/books
- Download book data from https://drive.google.com/uc?id=1LXpK1UfqtP89H1tYy0pBGHjYk8IhigUK
- Go to https://sites.google.com/eng.ucsd.edu/ucsdbookgraph/shelves
- Download https://drive.google.com/open?id=1zmylV7XW2dfQVCLeg1LbllfQtHD2KUon
- Download https://drive.google.com/uc?id=1CHTAaNwyzvbi1TR08MJrJ03BxA266Yxr
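
Once a file is downloaded, it can be read one record at a time instead of being loaded into memory all at once. The following is a minimal sketch, not the project's actual loader. It assumes the download is a gzip-compressed file with one JSON object per line; the function name `iter_books` and the `limit` parameter are hypothetical.

```python
import gzip
import json


def iter_books(path, limit=None):
    """Yield one dict per non-empty line of a gzip-compressed JSON-lines file.

    Assumption: the file holds one JSON object per line.
    `limit` caps the number of lines read, which helps when exploring a large file.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for count, line in enumerate(handle):
            if limit is not None and count >= limit:
                break
            line = line.strip()
            if line:
                yield json.loads(line)
```

Calling `iter_books(path, limit=5)` on a downloaded file and printing each record is a quick way to check which fields the data really contains before building anything on top of it.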
Creating a term frequency matrix: every unique word across all titles becomes a column in the matrix.
Go through each title: if the word appears in the title, set that cell to 1.
Rows are the book titles; columns are the terms.
Inverse document frequency: gives more weight to words that appear in few titles. idf(word) = log(number_of_titles / number_of_titles_containing_word)
TF-IDF: multiply each entry of the term frequency matrix by the inverse document frequency of that column's term.
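
The steps above can be sketched in plain Python as follows. This is an illustrative sketch, not the project's actual code: the names `tokenize` and `build_tfidf` are hypothetical, and splitting titles on whitespace after lowercasing is an assumption about how words are extracted.

```python
import math


def tokenize(title):
    """Lowercase a title and split it on whitespace (an assumed tokenizer)."""
    return title.lower().split()


def build_tfidf(titles):
    """Return (vocabulary, matrix); matrix[i][j] is the TF-IDF weight
    of vocabulary[j] in titles[i]."""
    # Each title becomes the set of words it contains (term frequency is 0 or 1).
    tokenized = [set(tokenize(t)) for t in titles]
    vocabulary = sorted(set().union(*tokenized))
    n_titles = len(titles)

    # Document frequency: the number of titles each word appears in.
    doc_freq = {w: sum(1 for words in tokenized if w in words) for w in vocabulary}

    # Inverse document frequency: log(number_of_titles / titles_containing_word).
    idf = {w: math.log(n_titles / doc_freq[w]) for w in vocabulary}

    # Rows are titles, columns are terms; each cell is (0 or 1) * idf.
    matrix = [
        [idf[word] if word in words else 0.0 for word in vocabulary]
        for words in tokenized
    ]
    return vocabulary, matrix
```

With this formula a word that appears in every title gets a weight of log(1) = 0, so it contributes nothing when comparing titles, while a word found in only one title gets the largest weight.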
- Find Similar Users
- Create Matrix
- Recommend Books
Creating a user/book matrix:
Every row is a different user, every column is a different book, and each cell contains that user's rating of that book.
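
A small sketch of this matrix, under stated assumptions: ratings arrive as `(user_id, book_id, rating)` triples, unrated books are stored as 0, and users are compared with cosine similarity. The source does not say which similarity measure is used, so cosine similarity is a stand-in here, and the function names are hypothetical.

```python
import math


def build_user_book_matrix(ratings):
    """Build a dense matrix from (user_id, book_id, rating) triples.

    Returns (users, books, matrix) where matrix[i][j] is the rating
    users[i] gave books[j], or 0 if that user has not rated the book.
    """
    users = sorted({user for user, _, _ in ratings})
    books = sorted({book for _, book, _ in ratings})
    user_index = {user: i for i, user in enumerate(users)}
    book_index = {book: j for j, book in enumerate(books)}

    matrix = [[0] * len(books) for _ in users]
    for user, book, rating in ratings:
        matrix[user_index[user]][book_index[book]] = rating
    return users, books, matrix


def cosine_similarity(row_a, row_b):
    """Cosine of the angle between two rating rows (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(row_a, row_b))
    norm_a = math.sqrt(sum(a * a for a in row_a))
    norm_b = math.sqrt(sum(b * b for b in row_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # A user with no ratings is similar to nobody.
    return dot / (norm_a * norm_b)
```

A dense list of lists is only practical for small samples. With many users and books most cells are empty, so a sparse representation would likely be needed for the full dataset.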
- Create additional Quick Database diagrams
- Add the my_ratings file
- Add a formatting file for the goodreads_export data
The Name of the Wind, Poor Charlie's Almanack, The War of Art, The Lord of the Rings trilogy, Zen and the Art of Motorcycle Maintenance, In My Own Way, The Hitchhiker's Guide to the Galaxy, Sapiens, Chronicles: Volume One
Terry Pratchett, Neil Gaiman, Nassim Nicholas Taleb, Walter Isaacson, Will MacAskill, Carlos Castaneda, Ernest Hemingway, Alan Watts, Hermann Hesse

