Tutorial: Aarini Data Toolkit
The data-toolkit is an automated data processing application. It watches a designated folder for incoming CSV data files and their associated metadata. When a new file is detected, the application converts it to Parquet, a columnar format that is typically more compact and faster to query than CSV. It then uploads both the new Parquet file and a backup of the original CSV to a Google Cloud Storage bucket. All pipeline settings are managed through a graphical user interface.
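The watch, convert, and upload flow described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the toolkit's actual code: the names `find_new_csv_files` and `process_once` are hypothetical, the folder is polled rather than monitored with file system events, and the conversion and upload steps are passed in as callables so that stand-ins can be used here. In a real setup the `convert` step might call `pandas.read_csv(...).to_parquet(...)` and the `upload` step might use the `google-cloud-storage` client; both are assumptions about how the toolkit could be built.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Set


def find_new_csv_files(watch_dir: Path, seen: Set[Path]) -> List[Path]:
    """Return CSV files in watch_dir that have not been processed yet."""
    return sorted(p for p in watch_dir.glob("*.csv") if p not in seen)


def process_once(
    watch_dir: Path,
    seen: Set[Path],
    convert: Callable[[Path], Path],
    upload: Callable[[Path], None],
) -> List[Path]:
    """Run one polling pass: convert each new CSV, then upload both files.

    `convert` and `upload` are hypothetical hooks supplied by the caller.
    """
    processed = []
    for csv_path in find_new_csv_files(watch_dir, seen):
        parquet_path = convert(csv_path)  # CSV -> Parquet (stand-in here)
        upload(parquet_path)              # upload the converted file
        upload(csv_path)                  # upload the original CSV as a backup
        seen.add(csv_path)                # remember it so it is not reprocessed
        processed.append(csv_path)
    return processed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        watch_dir = Path(tmp)
        (watch_dir / "readings.csv").write_text("sensor,value\na,1\n")

        uploaded: List[Path] = []
        seen: Set[Path] = set()

        # Stand-in converter: only computes the target path, writes nothing.
        fake_convert = lambda p: p.with_suffix(".parquet")

        process_once(watch_dir, seen, fake_convert, uploaded.append)
        print([p.name for p in uploaded])
```

Passing the conversion and upload steps in as functions keeps the detection logic testable without network access or a Parquet library. A second call to `process_once` with the same `seen` set skips files that were already handled.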
Source Repository: Data Toolkit
Chapters
- Application GUI and Configuration
- File System Monitor
- CSV-to-Parquet Conversion Engine
- Cloud Storage Uploader
- File Naming and Metadata Logic
- Singleton Process Manager (Windows)
Generated by AI Codebase Knowledge Builder