Tutorial: Aarini Data Toolkit

The data-toolkit is an automated data processing application. It watches a designated folder for incoming CSV data files and their associated metadata. When it detects a new file, it converts the CSV to Parquet, a columnar format that is typically more compact and faster to query than CSV. It then uploads both the new Parquet file and a backup of the original CSV to a Google Cloud Storage bucket. All pipeline settings are managed through a graphical user interface.
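The watch, convert, upload flow described above can be sketched roughly as follows. This is a minimal illustration, not the toolkit's actual code: `find_new_csvs`, `process_once`, and the injected `convert` and `upload` callables are hypothetical names, and the sketch assumes simple polling rather than whatever detection mechanism the File System Monitor really uses.

```python
from pathlib import Path
from typing import Callable, List, Set


def find_new_csvs(watch_dir: Path, seen: Set[Path]) -> List[Path]:
    """Return CSV files in watch_dir that have not been processed yet."""
    return sorted(p for p in watch_dir.glob("*.csv") if p not in seen)


def process_once(
    watch_dir: Path,
    seen: Set[Path],
    convert: Callable[[Path], Path],
    upload: Callable[[Path], None],
) -> List[Path]:
    """Run one polling pass over watch_dir.

    convert: hypothetical callable that writes a Parquet file for a CSV
             and returns the new file's path.
    upload:  hypothetical callable that sends one file to cloud storage.
    """
    processed = []
    for csv_path in find_new_csvs(watch_dir, seen):
        parquet_path = convert(csv_path)  # CSV -> Parquet
        upload(parquet_path)              # upload the converted file
        upload(csv_path)                  # upload the original CSV as a backup
        seen.add(csv_path)                # remember it so it is not reprocessed
        processed.append(csv_path)
    return processed
```

Passing `convert` and `upload` in as callables keeps the sketch independent of any particular Parquet or cloud storage library; the real application presumably wires in its own conversion engine and uploader at these two points.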

Source Repository: Data Toolkit

Chapters

  1. Application GUI and Configuration
  2. File System Monitor
  3. CSV-to-Parquet Conversion Engine
  4. Cloud Storage Uploader
  5. File Naming and Metadata Logic
  6. Singleton Process Manager (Windows)

Generated by AI Codebase Knowledge Builder