basvdberg/Media-Library

Photo and movie Organizer

Introduction

This repo contains a set of tools to help you organize your photo and movie collections.

The eventual goal is to consolidate different photo and movie archives (such as Google Photos, WhatsApp and digital cameras) into one archive. The reasons for doing this are:

  • Keep all your media together and easily accessible in chronological order.
  • Reduce cloud storage costs (e.g. Apple or Google) by taking part of your media offline.
  • Create an offline backup on any kind of storage device (e.g. a USB flash drive or an external hard disk).
  • Create compilations of all your media.

Programs

To achieve this goal I have made several programs:

organize_google_takeout.py

Fortunately, Google allows you to download your photos and movies, but the metadata is stored in external JSON files. This metadata contains important fields such as the date and time when the photo was taken and the geographical location. The program finds the JSON file that belongs to each media file, which is harder than it sounds because many different naming conventions are used for the JSON files.

This program fixes two things:

  1. The filenames of the media files, which should follow the convention PhotoTakenTimestamp-AlbumName-OriginalImageName.ext, e.g. 20210130T203201-Vacation Italy-IMG021.JPG. This is done by copying each media file to the target folder under its new name.
  2. The EXIF metadata of the media files.
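The naming convention above can be sketched in a few lines of Python. This is an illustration only, not the program's actual code: the function name build_target_name and its parameters are assumptions.

```python
from datetime import datetime
from pathlib import Path


def build_target_name(taken: datetime, album: str, original: str) -> str:
    """Return PhotoTakenTimestamp-AlbumName-OriginalImageName.ext (hypothetical helper)."""
    # Timestamp in the compact form used by the convention, e.g. 20210130T203201
    stamp = taken.strftime("%Y%m%dT%H%M%S")
    # Keep only the file name, dropping any directory part of the original path
    return f"{stamp}-{album}-{Path(original).name}"


print(build_target_name(datetime(2021, 1, 30, 20, 32, 1), "Vacation Italy", "IMG021.JPG"))
# 20210130T203201-Vacation Italy-IMG021.JPG
```

Because the timestamp comes first, sorting the target folder by filename also sorts it chronologically.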

detect_duplicates.py

Identifies duplicate files, defined as files with exactly the same creation datetime and exactly the same size. Outputs a CSV file listing the duplicates, which can then be removed by calling delete_duplicate_files.py.

Details

organize_google_takeout.py

Find corresponding JSON file.

  • Creates a CSV file (JSON-mapping.csv) in the Metadata sub folder containing:
    • Original media filename
    • JSON filename (original associated metadata file)

JSON File Matching

The program handles various JSON file naming patterns used by Google Takeout with flexible pattern matching:

  1. Direct match (.json, .suppl.json, .supplemental-metadata.json)
  2. Double dots (image.jpg..json - Google Takeout bug)
  3. Duplicate files (image(1).jpg → image.jpg.json or image.HEIC.supplemental-metadata.json)
  4. Edited images (image-bewerkt.jpg → image.jpg.json)
  5. Different extensions (IMG_0001.MP4 → IMG_0001.HEIC.supplemental-metadata.json)
  6. Truncated supplemental-metadata - Matches ANY truncation of "supplemental-metadata":
    • supplemental-metadata.json
    • supplemental-meta.json
    • supplemental-me.json
    • supplemental-m.json
    • supplemental.json
    • supplemen.json
    • supplem.json
    • supple.json
    • suppl.json
    • And any other truncation starting with "suppl"
  7. Truncated filenames (handles names >46 characters by removing trailing characters)
  8. JSON files with duplicate markers (image.mov.supplemental-met(1).json)
  9. Case-insensitive matching (HEIC vs heic, JPG vs jpg)
  10. Direct match without extension (IMG_NIGHT_xxx.mov → IMG_NIGHT_xxx.json)
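Several of the patterns above (truncated "supplemental-metadata" suffixes, duplicate markers and double dots) can be reduced to one normalization step that recovers the media filename a JSON file refers to. The sketch below is one possible approach, not the program's actual matching code: the function name media_name_from_json and the minimum prefix length of five characters ("suppl") are assumptions based on the list above.

```python
import re
from typing import Optional

FULL_SUFFIX = "supplemental-metadata"
MIN_PREFIX = len("suppl")  # shortest truncation treated as a metadata suffix


def media_name_from_json(json_name: str) -> Optional[str]:
    """Return the media filename a JSON name points to, or None if not a JSON file."""
    if not json_name.lower().endswith(".json"):
        return None
    base = json_name[: -len(".json")]
    # Drop a trailing duplicate marker such as "(1)"
    base = re.sub(r"\(\d+\)$", "", base)
    # Drop the stray dot left by the double-dot pattern (image.jpg..json)
    base = base.rstrip(".")
    head, dot, tail = base.rpartition(".")
    # Any prefix of "supplemental-metadata" of at least "suppl" counts as the suffix
    if dot and len(tail) >= MIN_PREFIX and FULL_SUFFIX.startswith(tail.lower()):
        return head
    return base


for name in ("image.jpg.json", "image.jpg..json",
             "IMG_0001.HEIC.suppl.json", "image.mov.supplemental-met(1).json"):
    print(name, "->", media_name_from_json(name))
```

The returned name would still need a case-insensitive comparison against the media files, and the edited-image and different-extension patterns need their own rules.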

The matching system uses multiple strategies:

  • Direct file path matching (fastest)
  • Regex pattern matching (flexible)
  • Fallback pattern matching (catches edge cases)

Missing JSON Metadata Files

Some images may not have corresponding JSON files. Common reasons:

  1. Split archives - an image and its JSON file may end up in different archives (a Google Takeout export can be split across multiple .tar.gz files).
  2. Deleted photos - photos deleted before the export may not have metadata files.

The program will report which images are missing JSON files and provide analysis.

EXIF Metadata Restoration

Updates the following EXIF data in image files, based on the JSON metadata:

  • DateTimeOriginal: Uses photoTakenTime (when photo was taken)
  • DateTimeDigitized: Uses creationTime (when photo was digitized/created in Google Photos)
  • GPS coordinates (latitude, longitude, altitude)
  • Camera information (make, model)
  • Photo settings (focal length, aperture, ISO, exposure time)
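EXIF stores dates as "YYYY:MM:DD HH:MM:SS" strings and GPS coordinates as degree, minute and second rationals with a separate hemisphere reference, so the JSON values have to be converted before they can be written. The sketch below shows those conversions using only the standard library. It assumes that photoTakenTime is available as a Unix timestamp in seconds and that UTC is the desired time zone; the function names are hypothetical, and writing the values into the file (for example with piexif) is left out.

```python
from datetime import datetime, timezone
from fractions import Fraction
from typing import Tuple

Rational = Tuple[int, int]


def exif_datetime(unix_timestamp: str) -> str:
    """Format a Unix timestamp (seconds, assumed UTC) as an EXIF datetime string."""
    taken = datetime.fromtimestamp(int(unix_timestamp), tz=timezone.utc)
    return taken.strftime("%Y:%m:%d %H:%M:%S")


def to_dms_rationals(decimal_degrees: float) -> Tuple[Rational, Rational, Rational]:
    """Convert decimal degrees to (degrees, minutes, seconds) rationals; the sign is dropped."""
    value = abs(decimal_degrees)
    degrees = int(value)
    minutes_float = (value - degrees) * 60
    minutes = int(minutes_float)
    seconds = Fraction((minutes_float - minutes) * 60).limit_denominator(10000)
    return ((degrees, 1), (minutes, 1), (seconds.numerator, seconds.denominator))


def latitude_ref(latitude: float) -> str:
    """Hemisphere reference that accompanies the unsigned latitude."""
    return "N" if latitude >= 0 else "S"


print(exif_datetime("1612038721"))
print(to_dms_rationals(52.5), latitude_ref(52.5))
```

Longitude works the same way with "E" and "W" as reference values.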

Copy and rename images to target folder

  • Copies media files to a target folder using the rename pattern PhotoTakenTimestamp-AlbumName-OriginalImageName.ext (example: 20231215T143022-MyAlbum-IMG_1234.jpg).
  • Copies JSON metadata files to the Metadata/Google subfolder.
  • Handles duplicate filenames automatically.
  • Supports an --overwrite flag that controls whether existing files are skipped or overwritten.
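The skip-or-overwrite behaviour described above could look roughly like this. It is a minimal sketch under assumptions: the function name copy_media and its signature are hypothetical, and the automatic handling of duplicate filenames is not covered.

```python
import shutil
from pathlib import Path


def copy_media(source: Path, target_dir: Path, target_name: str,
               overwrite: bool = False) -> bool:
    """Copy source to target_dir/target_name; return True if a file was written."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / target_name
    if target.exists() and not overwrite:
        # Existing file and no overwrite requested: skip it
        return False
    # copy2 also preserves file timestamps where the platform allows it
    shutil.copy2(source, target)
    return True
```

The original file is left untouched, so a run can be repeated safely: without the overwrite option, files already present in the target folder are skipped.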

delete_duplicates.py

  • Scans a folder for duplicate media files.
  • Two media files are considered duplicates if:
    • They have exactly the same creation datetime (in the EXIF metadata)
    • They have exactly the same size
  • Creates duplicates.csv containing the duplicate filenames.
  • Creates duplicates.html, which lists all duplicates in two columns, offers options to select all or a subset of the duplicates, and has a button to deduplicate the selected files (the first occurrence is kept).
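The duplicate rule above amounts to grouping files by the pair (creation datetime, size). The sketch below illustrates that grouping on data that has already been read. The MediaInfo type and the find_duplicates function are hypothetical names, and reading the EXIF datetime from the files is left out.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class MediaInfo(NamedTuple):
    name: str
    created: str  # EXIF creation datetime, e.g. "2021:01:30 20:32:01"
    size: int     # file size in bytes


def find_duplicates(files: List[MediaInfo]) -> List[List[str]]:
    """Group filenames that share the same creation datetime and size."""
    groups: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    for info in files:
        groups[(info.created, info.size)].append(info.name)
    # Only groups with more than one file contain duplicates
    return [names for names in groups.values() if len(names) > 1]


files = [
    MediaInfo("a.jpg", "2021:01:30 20:32:01", 1000),
    MediaInfo("b.jpg", "2021:01:30 20:32:01", 1000),
    MediaInfo("c.jpg", "2021:01:30 20:32:01", 2000),
]
for names in find_duplicates(files):
    print("keep", names[0], "remove", names[1:])
```

Within each group the first name is the one to keep and the rest are candidates for removal.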

Installation

  1. Install Python 3.9 or higher

  2. Install dependencies from requirements.json:

    python install.py

    Or install manually:

    pip install piexif pillow

Log File

The program automatically creates a log file in the Metadata folder:

  • Filename format: log_YYYYMMDD_HHMMSS.txt
  • Contains all console output (with ANSI color codes removed)
  • Useful for reviewing processing history and debugging
  • Log file path is displayed at the end of execution
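The log filename format and the removal of ANSI color codes can be illustrated as follows. This is a sketch, not the program's logging code: the names are hypothetical and the regular expression only covers color and style sequences of the form ESC [ ... m.

```python
import re
from datetime import datetime

# Matches ANSI color/style (SGR) sequences such as "\x1b[32m" and "\x1b[0m"
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def log_filename(now: datetime) -> str:
    """Build a name in the format log_YYYYMMDD_HHMMSS.txt."""
    return now.strftime("log_%Y%m%d_%H%M%S.txt")


def strip_ansi(text: str) -> str:
    """Remove ANSI color codes so the log file contains plain text."""
    return ANSI_ESCAPE.sub("", text)


print(log_filename(datetime(2023, 12, 15, 14, 30, 22)))
print(strip_ansi("\x1b[32mDone\x1b[0m"))
```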

About

Utility to fix EXIF metadata based on Google JSON metadata and to rename files using that EXIF metadata.
