Selector extraction: I first inspected the target site and manually extracted the selectors required at each step: registration, login, closing popups, navigating to deposit, selecting channels and amounts, and reaching the payment page.
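Keeping the extracted selectors in one place makes them easy to update when the site changes. A minimal sketch — the selector values below are hypothetical placeholders, not the site's actual markup:

```python
# Hypothetical CSS selectors, one per flow step. The real values were
# extracted by inspecting the target site and will differ.
SELECTORS = {
    "register_username": "input[name='username']",
    "register_password": "input[name='password']",
    "register_submit": "button[type='submit']",
    "popup_close": ".modal .close",
    "deposit_nav": "a[href*='deposit']",
    "channel_option": ".channel-list .channel-item",
    "amount_option": ".amount-list .amount-item",
    "pay_button": "#pay-now",
    "upi_id": ".payment-info .upi-id",
    "qr_code": ".payment-info .qr-code",
}
```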
Async Playwright flow: Using these selectors, I implemented the automation in flow.py with Playwright’s async API. Each run creates a new account, logs in, and executes the deposit flow.
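The per-run flow can be sketched roughly as below. This is not the actual flow.py code: the URL and selector strings are placeholders, and the page object is passed in so the sequence can be exercised against a stub instead of a live browser.

```python
# Sketch of one register -> login -> deposit run. In the real scraper,
# `page` would be a Playwright async Page; here it is any object with
# the same goto/fill/click coroutines. Selectors are illustrative.
async def deposit_flow(page, username: str, password: str) -> None:
    await page.goto("https://example.com/register")  # placeholder URL
    await page.fill("input[name='username']", username)
    await page.fill("input[name='password']", password)
    await page.click("button[type='submit']")        # register + log in
    await page.click(".modal .close")                # dismiss popup
    await page.click("a[href*='deposit']")           # navigate to deposit
    await page.click(".channel-list .channel-item")  # select a channel
    await page.click(".amount-list .amount-item")    # select an amount
    await page.click("#pay-now")                     # reach payment page
```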
Result capture: At the end of the flow, the scraper captures either a UPI ID or marks "QR_ONLY", saves the payment page URL, and takes a screenshot of the QR/UPI section.
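The capture step amounts to: read the UPI element if present, fall back to "QR_ONLY" otherwise, then record the URL, a timestamp, and a screenshot. A sketch under the same stub-friendly assumptions as above — selector strings and field names are illustrative, not output.csv's exact schema:

```python
from datetime import datetime, timezone

# Sketch of the result-capture step. `page` mirrors the Playwright
# async Page API (query_selector, screenshot, .url property); the
# selector and dict keys are assumptions for illustration.
async def capture_result(page, screenshot_path: str) -> dict:
    upi_el = await page.query_selector(".payment-info .upi-id")
    upi = (await upi_el.inner_text()).strip() if upi_el else "QR_ONLY"
    await page.screenshot(path=screenshot_path)  # QR/UPI section snapshot
    return {
        "upi_or_qr": upi,
        "payment_url": page.url,
        "screenshot": screenshot_path,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```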
Management: In runner.py, I added retry logic, concurrency control, and CSV writing. This ensures that even if some runs fail due to dynamic UI issues, the scraper continues until 7,000 successful records are collected.
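The retry-and-concurrency shell can be sketched as follows; the attempt count, concurrency limit, and backoff are illustrative values, not necessarily those in runner.py:

```python
import asyncio

MAX_ATTEMPTS = 3   # illustrative; runner.py's limit may differ
CONCURRENCY = 5    # illustrative cap on simultaneous browser runs

async def run_with_retries(task, sem: asyncio.Semaphore):
    """Run one flow under a semaphore, retrying on failure.

    Returns the task's result, or None once all attempts are exhausted
    so the runner can keep collecting from other runs.
    """
    async with sem:
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                return await task()
            except Exception:
                if attempt == MAX_ATTEMPTS:
                    return None
                await asyncio.sleep(0.1 * attempt)  # simple backoff
```

In the real runner, this wrapper would sit inside a loop that keeps scheduling fresh runs until 7,000 successful records are collected.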
Final dataset: The results are written into output.csv with deterministic screenshot paths, producing a reproducible dataset of 7,000 rows.
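"Deterministic screenshot paths" means each run index always maps to the same file name, so re-running produces the same layout. A sketch of that mapping and the CSV writing — the column names and directory are assumptions based on the captured fields, not necessarily output.csv's exact header:

```python
import csv
import pathlib

# Assumed column order; the real output.csv header may differ.
FIELDS = ["run_id", "upi_or_qr", "payment_url", "timestamp", "screenshot"]

def screenshot_path(run_id: int, root: str = "screenshots") -> str:
    # Zero-padded index: the same run always maps to the same file.
    return str(pathlib.Path(root) / f"run_{run_id:05d}.png")

def write_rows(rows, path: str = "output.csv") -> None:
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```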
Limitations
The scraper is tightly coupled to the current structure of the target site. If the site’s UI or selectors change, the automation will need updates.
About
Async Python scraper using Playwright that automates the flow from account creation through payment and captures UPI details, URLs, timestamps, and screenshots.