
Data cleaner


Language: Python

Tool Type: Algorithm

License: The MIT License

Version: 0.2.1

About the tool

Responsible: Head of Cabinet of Ministers

What is it?

Data cleaner is a tool developed by the Argentine government to optimize the management and federation of metadata from public data catalogs. It makes it easy for users to clean CSV files through a set of predefined rules, promoting the transparency and accessibility of government information. Data cleaner is part of a broader open data initiative that seeks to improve the accuracy, organization and usefulness of publicly available data.

What problems does it solve?

Data cleaner solves the problem of disorganization and errors in public records metadata, facilitating more efficient and transparent management of open data. This improvement is vital for the integrity and accessibility of government information, promoting public transparency.

How does the tool work?

Data cleaner automates data cleaning by applying predefined rules. It provides methods for individual cleaning tasks, such as string normalization and email formatting, and supports customizable rules for renaming columns, removing columns, and handling duplicates. It reads several data formats, including CSV and SHP files, and integrates with pandas for DataFrame manipulation.
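The rule-driven approach described above can be illustrated with a minimal sketch. It uses only the Python standard library and does not reproduce Data cleaner's own API: the rule names (`remove_columns`, `rename_columns`, `normalize_string`, `lowercase_email`, `drop_duplicates`), the rule schema, and the sample column names are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical rule list for illustration; not Data cleaner's actual schema.
RULES = [
    {"rule": "remove_columns", "fields": ["internal_id"]},
    {"rule": "rename_columns", "mapping": {"nombre": "name"}},
    {"rule": "normalize_string", "fields": ["name"]},
    {"rule": "lowercase_email", "fields": ["email"]},
    {"rule": "drop_duplicates"},
]


def normalize_string(value):
    """Collapse runs of whitespace and title-case the result."""
    return " ".join(value.split()).title()


def apply_rules(rows, rules):
    """Apply each rule, in order, to a list of row dictionaries."""
    for rule in rules:
        kind = rule["rule"]
        if kind == "remove_columns":
            drop = set(rule["fields"])
            rows = [{k: v for k, v in row.items() if k not in drop} for row in rows]
        elif kind == "rename_columns":
            mapping = rule["mapping"]
            rows = [{mapping.get(k, k): v for k, v in row.items()} for row in rows]
        elif kind == "normalize_string":
            fields = set(rule["fields"])
            rows = [
                {k: normalize_string(v) if k in fields else v for k, v in row.items()}
                for row in rows
            ]
        elif kind == "lowercase_email":
            fields = set(rule["fields"])
            rows = [
                {k: v.strip().lower() if k in fields else v for k, v in row.items()}
                for row in rows
            ]
        elif kind == "drop_duplicates":
            seen = set()
            unique = []
            for row in rows:
                key = tuple(row.items())
                if key not in seen:  # keep only the first occurrence of each row
                    seen.add(key)
                    unique.append(row)
            rows = unique
        else:
            raise ValueError(f"Unknown rule: {kind}")
    return rows


def clean_csv(text, rules):
    """Parse CSV text into row dictionaries and apply the cleaning rules."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return apply_rules(rows, rules)
```

Keeping the rules as plain data, separate from the functions that apply them, means the same cleaning recipe can be stored alongside a dataset and re-run whenever the source file is updated. Rule order matters in this sketch: duplicates are dropped last so that rows differing only in spacing or letter case are merged.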

Open standards

Data cleaner is built on Python 3.6 and uses pandas and geopandas for data manipulation. It adheres to the UTF-8 encoding standard, supports the CSV, GeoJSON, and KML file formats for interoperability, and uses the arrow library to parse and format dates.
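As a rough illustration of the two standardization steps mentioned above, the sketch below re-encodes text as UTF-8 and rewrites a date in ISO 8601 form. It substitutes the standard library's `datetime` module for arrow, and the function names, the `latin-1` source encoding, and the day/month/year input format are assumptions chosen for the example rather than details taken from the tool.

```python
from datetime import datetime


def reencode_to_utf8(raw_bytes, source_encoding="latin-1"):
    """Decode bytes from an assumed source encoding and re-encode them as UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")


def to_iso_date(value, input_format="%d/%m/%Y"):
    """Parse a date string in the assumed input format and return YYYY-MM-DD."""
    return datetime.strptime(value.strip(), input_format).date().isoformat()
```

For instance, `to_iso_date("31/12/2015")` returns `"2015-12-31"`, and a date that does not match the expected format raises `ValueError` instead of being silently passed through.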

Sector: Reform or Modernization of the State
Functionality: Database management
Sustainable development goals: Partnership for the goals
Data Transformation Process

Scatter plot with red dots for "Raw Data" and green line for "Cleaned Data". Axes: "Data Attribute 1" (0 to 10) and "Data Attribute 2" (0 to 10). Title: "Data Cleaning: From Chaos to Order".

Data Cleaner Library Visual Summary

Text image about "Data Cleaner Library": optimizes data processing tasks, follows Argentina's standards, automates cleaning, under development, handles strings, emails, dates, seeks precision and standard.

Data Cleaning Rules and Text Description

This image contains a Spanish-language text and Python code related to data cleaning, specifying rules for capitalizing and date formatting in a CSV file.

Official Data Cleaner Documentation

Basic documentation page

Open Data Package Tool Page

Information on tools related to open data
