
Language: Python
Tool Type: Algorithm
License: The MIT License
Version: 0.2.1
Head of Cabinet of Ministers

Data cleaner is a tool developed by the Argentine government to improve the management and federation of metadata from public data catalogs. It lets users clean CSV files by applying a set of predefined rules. It is part of a broader open data initiative that seeks to improve the accuracy, organization and usefulness of publicly available data.
Data cleaner addresses disorganization and errors in the metadata of public records. Cleaner metadata makes open data easier to manage, which supports the integrity and accessibility of government information and, in turn, public transparency.
Data cleaner automates data cleaning based on predefined rules. It provides methods for individual cleaning tasks such as string normalization and email formatting. Rules are customizable and cover renaming columns, removing columns, and handling duplicates. It supports several data formats, including CSV and SHP files, and integrates with pandas for DataFrame manipulation.
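The rule-driven approach described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not Data cleaner's own API: the rule names (`rename_columns`, `remove_columns`, `title_case`, `normalize_email`, `drop_duplicates`), the `apply_rules` function and the sample columns are all hypothetical and chosen only to mirror the features listed.

```python
import csv
import io

# Hypothetical rule format for illustration; not Data cleaner's actual schema.
RULES = [
    {"rule": "rename_columns", "mapping": {"Nombre": "name", "Correo": "email"}},
    {"rule": "remove_columns", "fields": ["notes"]},
    {"rule": "title_case", "fields": ["name"]},
    {"rule": "normalize_email", "fields": ["email"]},
    {"rule": "drop_duplicates"},
]


def apply_rules(rows, rules):
    """Apply each rule in order to copies of the row dicts and return the result."""
    rows = [dict(row) for row in rows]  # work on copies, leave the input untouched
    for rule in rules:
        kind = rule["rule"]
        if kind == "rename_columns":
            mapping = rule["mapping"]
            rows = [{mapping.get(k, k): v for k, v in row.items()} for row in rows]
        elif kind == "remove_columns":
            drop = set(rule["fields"])
            rows = [{k: v for k, v in row.items() if k not in drop} for row in rows]
        elif kind == "title_case":
            # Collapse repeated whitespace, then capitalize each word.
            for row in rows:
                for field in rule["fields"]:
                    row[field] = " ".join(row[field].split()).title()
        elif kind == "normalize_email":
            for row in rows:
                for field in rule["fields"]:
                    row[field] = row[field].strip().lower()
        elif kind == "drop_duplicates":
            seen = set()
            unique = []
            for row in rows:
                key = tuple(row.items())
                if key not in seen:
                    seen.add(key)
                    unique.append(row)
            rows = unique
        else:
            raise ValueError(f"Unknown rule: {kind}")
    return rows


RAW_CSV = """Nombre,Correo,notes
  juan   PEREZ , Juan.Perez@Example.COM ,x
JUAN perez,juan.perez@example.com,y
ana gomez,ANA@example.com,z
"""

rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
cleaned = apply_rules(rows, RULES)
for row in cleaned:
    print(row)
```

Because the rules are plain data, the same list can be stored alongside a dataset and reapplied whenever the file is updated. Rule order matters in this sketch: duplicates are detected only after the names and emails have been normalized.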
Data cleaner is built on Python 3.6 and uses pandas and geopandas for data manipulation. It adheres to UTF-8 encoding, so text is handled consistently across files. It supports the CSV, GeoJSON, and KML file formats, which promotes interoperability, and it uses arrow to parse and format dates.
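As a rough illustration of the encoding and date handling mentioned above, the sketch below decodes UTF-8 input and rewrites dates in ISO 8601 form. It deliberately swaps in the standard library's `datetime` for arrow so that it runs without extra dependencies. The helper names `decode_utf8` and `to_iso_date` are hypothetical, and the `%d/%m/%Y` pattern is an assumed input format, not something the tool prescribes.

```python
from datetime import datetime


def decode_utf8(raw_bytes):
    """Decode bytes as UTF-8, dropping a leading byte-order mark if present."""
    return raw_bytes.decode("utf-8-sig")


def to_iso_date(value, input_format):
    """Parse value with a strptime pattern; return YYYY-MM-DD, or '' if it does not parse."""
    try:
        return datetime.strptime(value.strip(), input_format).date().isoformat()
    except ValueError:
        return ""


# Sample column with a header, one valid date and one value in the wrong format.
raw = b"\xef\xbb\xbf" + "fecha\n25/12/2016\n31-02-2016\n".encode("utf-8")
lines = decode_utf8(raw).splitlines()
header, values = lines[0], lines[1:]
dates = [to_iso_date(value, "%d/%m/%Y") for value in values]
print(header, dates)
```

Returning an empty string for values that do not parse is one possible design choice. It keeps the column aligned with the input rows while making the failures easy to find afterwards.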


Image: scatter plot titled "Data Cleaning: From Chaos to Order", with red dots for "Raw Data" and a green line for "Cleaned Data". Axes: "Data Attribute 1" and "Data Attribute 2", both from 0 to 10.

Image: text summary of the "Data Cleaner Library". The library optimizes data processing tasks, follows Argentina's standards, automates cleaning, handles strings, emails and dates, aims for precision and standardization, and is under development.

Image: Spanish-language text and Python code for data cleaning, specifying capitalization and date-formatting rules for a CSV file.
Basic documentation page
Information on tools related to open data
