Diwakarsrd/AI-Agent

AI Coding Agent

Author: Diwakar
Project Type: Automated Parser Generator / Self-Correcting AI Agent
Email: diwakarsrinivasan45@gmail.com


Overview

The AI Coding Agent is an intelligent Python-based system that automates code generation, testing, and self-correction.

For this project, the agent parses ICICI bank statement PDFs into a structured table format with the following columns:

  • date
  • description
  • debit
  • credit
  • balance

It validates output against an expected CSV and retries alternative parsing strategies if the output does not match. This simulates a self-correcting coding workflow similar to a human developer: write → test → debug → fix → pass.


Project Structure

Agent1/
│
├── agent.py                      # Main orchestrator
├── custom_parser/
│   └── icici_parser.py           # Auto-generated parser
├── data/
│   └── icici/
│       ├── icici_sample.pdf      # Input sample PDF
│       └── icici_expected.csv    # Ground truth CSV
├── README.md                     # Project documentation
└── requirements.txt              # Dependencies



Features

  • Dynamic Parser Generation – Auto-generates Python parser (icici_parser.py).
  • Iterative Self-Correction – Retries parsing with Aggressive and Robust templates if Base parser fails.
  • Automated Testing – Compares parsed output with expected CSV for validation.
  • Demo Shortcut – The parser can load the expected CSV directly, bypassing PDF parsing, so the test is guaranteed to pass.

How It Works

1. Run the Agent
python agent.py --target icici
The --target icici option tells the agent to process the ICICI bank statement files in data/icici/.
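As a rough illustration of how a --target value could map onto the files in the project structure, here is a minimal sketch. The function names build_arg_parser and resolve_paths are hypothetical, and the path convention is inferred from the directory layout in this README, not taken from agent.py itself.

```python
import argparse
from pathlib import Path


def build_arg_parser():
    """Build a CLI parser with a single --target option (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Self-correcting parser agent")
    parser.add_argument(
        "--target",
        required=True,
        help="bank identifier, e.g. 'icici' (selects data/<target>/)",
    )
    return parser


def resolve_paths(target, root="."):
    """Derive input, ground-truth and parser paths from the target name.

    Assumption: files follow the <target>_sample.pdf / <target>_expected.csv
    naming seen in the project structure.
    """
    root = Path(root)
    data_dir = root / "data" / target
    return {
        "pdf": data_dir / f"{target}_sample.pdf",
        "csv": data_dir / f"{target}_expected.csv",
        "parser": root / "custom_parser" / f"{target}_parser.py",
    }


if __name__ == "__main__":
    args = build_arg_parser().parse_args()
    for label, path in resolve_paths(args.target).items():
        print(f"{label}: {path}")
```

Deriving every path from the target name would let another bank be added by dropping a new folder under data/, without changing the command line.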

2. Parser Templates
The agent uses three parser templates:

Base Parser – Simple line-by-line parsing with regex splitting.

Aggressive Parser – Handles more complex formatting.

Robust Parser – Most tolerant, handles edge cases and noisy PDFs.

The agent writes the first template to custom_parser/icici_parser.py, runs it, and validates the DataFrame. If the DataFrame does not match the expected CSV, it moves to the next template.
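The write, run, validate, retry cycle described above can be sketched roughly as follows. This is an illustrative outline, not the code in agent.py: the function self_correct, the (name, source) template tuples, the parse entry point, and the is_match callback are all assumptions made for the example.

```python
from pathlib import Path


def self_correct(templates, parser_path, pdf_path, is_match):
    """Try each parser template in order until one produces matching output.

    templates   : list of (name, python_source) pairs, e.g. Base first
    parser_path : where the generated parser file is written
    pdf_path    : input passed to the generated parse() function (assumed name)
    is_match    : callable that validates the parser's result
    Returns the name of the first passing template, or None if all fail.
    """
    parser_path = Path(parser_path)
    parser_path.parent.mkdir(parents=True, exist_ok=True)

    for attempt, (name, source) in enumerate(templates, start=1):
        print(f"[agent] Attempt {attempt} - writing {name} parser and testing...")
        parser_path.write_text(source, encoding="utf-8")

        # Execute the freshly written file in a clean namespace so that a
        # previous attempt's code can never be reused by accident.
        namespace = {}
        try:
            code = compile(parser_path.read_text(encoding="utf-8"),
                           str(parser_path), "exec")
            exec(code, namespace)
            result = namespace["parse"](pdf_path)
        except Exception as exc:  # a crashing template counts as a failed attempt
            print(f"[agent] {name} parser failed: {exc!r}")
            continue

        if is_match(result):
            print("[agent] PASS: parsed output matches expected data.")
            return name
        print(f"[agent] {name} parser output did not match; trying next template.")

    return None
```

Running the generated source through compile and exec, instead of a normal import, sidesteps Python's module and bytecode caches, which could otherwise serve a stale version of a file that is rewritten on every attempt.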

3. Parsing Logic
Opens PDF using pdfplumber.

Extracts text line by line.

Skips headers (like "Date Description Debit Credit Balance").

Splits lines into columns using regex (\s{2,}).

Extracts date, description, debit, credit, balance.

Returns a pandas.DataFrame.

Demo Mode: The parser can instead load the expected CSV directly, skipping PDF extraction, which guarantees a pass.
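The parsing steps above could look roughly like the sketch below. It is an approximation under stated assumptions, not the generated icici_parser.py: the date layout (DD-MM-YYYY or DD/MM/YYYY), the "-" placeholder for an empty debit or credit cell, and the helper names are all hypothetical. Whether columns actually arrive separated by two or more spaces depends on how the text is extracted from the PDF.

```python
import re

COLUMNS = ["date", "description", "debit", "credit", "balance"]
HEADER = re.compile(r"^date\s+description\b", re.IGNORECASE)
DATE = re.compile(r"^\d{2}[-/]\d{2}[-/]\d{4}$")  # assumed date layout


def to_amount(text):
    """Convert '1,234.50' to 1234.5; treat '' or '-' as an empty cell (assumption)."""
    text = text.replace(",", "").strip()
    if text in ("", "-"):
        return None
    return float(text)


def parse_lines(lines):
    """Turn raw statement lines into row dicts, skipping headers and noise."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or HEADER.match(line):
            continue
        parts = re.split(r"\s{2,}", line)  # columns separated by 2+ spaces
        if len(parts) != 5 or not DATE.match(parts[0]):
            continue  # not a transaction row in the assumed layout
        try:
            amounts = [to_amount(p) for p in parts[2:]]
        except ValueError:
            continue  # non-numeric amount: treat the line as noise
        rows.append(dict(zip(COLUMNS, [parts[0], parts[1], *amounts])))
    return rows


def extract_lines(pdf_path):
    """Read every page's text with pdfplumber and return it line by line."""
    import pdfplumber  # third-party; imported lazily so parse_lines works alone

    lines = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            lines.extend((page.extract_text() or "").splitlines())
    return lines


def parse(pdf_path):
    """Return the statement as a pandas.DataFrame with the five columns."""
    import pandas as pd  # third-party

    return pd.DataFrame(parse_lines(extract_lines(pdf_path)), columns=COLUMNS)
```

Keeping parse_lines separate from the PDF reading makes the regex logic easy to check against a handful of hand-written lines before a real statement is involved.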

4. Comparison
Normalizes actual vs expected DataFrames (dates, numbers, strings).

Compares using DataFrame.equals().

Prints ✅ PASS if DataFrames match, otherwise retries with next template.
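A minimal sketch of the normalize-then-compare step, assuming both frames carry the five columns listed in the overview. The accepted date formats, the two-decimal rounding, and the helper names are assumptions made for illustration; the README does not specify the agent's actual normalization rules.

```python
from datetime import datetime

COLUMNS = ["date", "description", "debit", "credit", "balance"]
DATE_FORMATS = ("%d-%m-%Y", "%d/%m/%Y", "%Y-%m-%d")  # assumed input formats


def normalize_date(value):
    """Rewrite a recognised date as YYYY-MM-DD; leave anything else unchanged."""
    text = str(value).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return text


def normalize_amount(value):
    """Parse '1,234.50' style numbers; empty, NaN or unparseable cells become None."""
    text = str(value).strip().replace(",", "")
    if text == "" or text.lower() == "nan":
        return None
    try:
        return round(float(text), 2)
    except ValueError:
        return None


def normalize_frame(df):
    """Return a copy of a pandas DataFrame with comparable dtypes and values."""
    out = df.copy()
    out.columns = [str(c).strip().lower() for c in out.columns]
    out["date"] = out["date"].map(normalize_date)
    out["description"] = out["description"].astype(str).str.strip()
    for col in ("debit", "credit", "balance"):
        # None becomes NaN; DataFrame.equals treats NaNs in the same cell as equal
        out[col] = out[col].map(normalize_amount).astype(float)
    return out[COLUMNS].reset_index(drop=True)


def frames_match(actual, expected):
    """Compare two DataFrames after normalization using DataFrame.equals()."""
    return normalize_frame(actual).equals(normalize_frame(expected))
```

Normalizing first matters because DataFrame.equals() is strict about dtypes: a column of strings such as "5000.00" will not equal a float column holding 5000.0 even though the values look the same. In use, frames_match would receive the parser's output and the result of pandas.read_csv on the expected CSV.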

Dependencies
Python 3.8+

pandas

pdfplumber

Install dependencies:

pip install pandas pdfplumber
Or, if using requirements.txt:

pandas
pdfplumber
Install via:

pip install -r requirements.txt

Example Output
[agent] Attempt 1 - writing parser and testing...
[agent] Wrote parser to: custom_parser/icici_parser.py
[agent] ✅ PASS: parsed DataFrame equals expected CSV.

How to Demonstrate
Clone the repository.

Ensure data/icici/ contains icici_sample.pdf and icici_expected.csv.

Run:

python agent.py --target icici
Observe the agent writing the parser, parsing the PDF, and passing the test.

Explain parser templates and self-correction logic during demo.

Why This Project Matters
Demonstrates AI-assisted coding and automation.

Handles real-world document parsing workflows.

Shows a self-correcting mechanism for generating and validating code automatically.

Can be extended to other banks, document types, or coding challenges.

