Tabula

Language: Java
Tool type: Desktop app
License: The MIT License
Version: 1.2.1
Responsible: Manuel Aristarán

What is it?

Tabula is a desktop application that extracts tables from PDF files and converts them into editable formats such as CSV or Excel. It was developed to simplify data extraction from PDF documents and to make data locked in non-editable formats easier to access and manipulate. It is widely used in journalism and data analysis, where turning table content into usable formats supports investigative work.

What problems does it solve?

Tabula solves the problem of accessing and manipulating data trapped in tables within PDF documents, a notoriously difficult format for editing and extraction. It facilitates data analysis and supports journalists and analysts in their investigative work.

How does the tool work?

Tabula converts tables in text-based PDFs to CSV or Excel. Files are processed locally on the user's machine, which keeps the data secure, and no internet access is required. The tool is operated through a user-friendly web interface and runs on Windows, Mac, and Linux.
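
As a rough illustration of the last step of that conversion, the sketch below serializes rows of cell text as CSV. It is not Tabula's implementation: the class `CsvSketch`, its methods, and the sample rows are hypothetical. The quoting follows the common RFC 4180 convention, and Java 8 or later is assumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Tabula: serializes table rows as CSV text.
class CsvSketch {

    // Quote a cell when it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes (RFC 4180 style).
    static String escapeCell(String cell) {
        if (cell.contains(",") || cell.contains("\"") || cell.contains("\n") || cell.contains("\r")) {
            return "\"" + cell.replace("\"", "\"\"") + "\"";
        }
        return cell;
    }

    // Join the escaped cells of each row with commas, one row per line.
    static String toCsv(List<List<String>> rows) {
        return rows.stream()
                .map(row -> row.stream().map(CsvSketch::escapeCell).collect(Collectors.joining(",")))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Made-up sample rows standing in for cells extracted from a PDF table.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Year", "Measure", "Amount"),
                Arrays.asList("2012", "Health, total", "10"));
        System.out.println(toCsv(rows));
    }
}
```

The second sample row contains a comma inside a cell, so that cell is wrapped in quotes to keep it in one column when the file is opened in a spreadsheet.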

Open standards

Tabula requires a Java runtime environment compatible with Java 7 or higher. It runs on the JVM and can be integrated with JVM languages, and JRuby and R bindings are available for development in other languages. It can be deployed in a Docker container. The software is released under the MIT License, which permits open use and distribution.
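
The Java 7 minimum can be checked on a given machine with a few lines of code. The sketch below is only an illustration: the class `JavaVersionCheck` and its method are hypothetical and not part of Tabula. It reads the standard `java.specification.version` system property, which reports `1.x` for Java 8 and earlier and a plain number from Java 9 onward.

```java
// Hypothetical helper for illustration: checks the running JVM against a minimum Java version.
class JavaVersionCheck {

    // Convert a "java.specification.version" value to a major version number.
    // Java 8 and earlier report "1.x" (for example "1.7"); Java 9 and later report "9", "11", "17".
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        if (parts[0].equals("1") && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    public static void main(String[] args) {
        int major = majorVersion(System.getProperty("java.specification.version"));
        if (major >= 7) {
            System.out.println("Java " + major + " meets the Java 7 minimum.");
        } else {
            System.out.println("Java " + major + " is older than the Java 7 minimum.");
        }
    }
}
```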

Sector: Science and Technology
Functionality: Data collection, analysis and visualization
Sustainable development goals: Partnership for the goals
Toolkits: Topic - Municipalities
Budget Proposal Deficit Impact

Table with budget deficit projections for 2012-2022. Shows annual and total figures for growth measures, health, defense, tax revenues, and more, in billions of dollars.

PDF Table Data Extraction Java Code Snippet

This image presents a Java code snippet for an API usage example that extracts rows and cells from tables in a PDF document using Apache PDFBox.

Tabula Project Sponsors

Tabula thanks: Gratitude for tabula-java support from donors like DB, IN, BS, SG, ER. Also, thanks to Knight Foundation and Shuttleworth Funded for their financial backing.

Tabula official site

Main page to download and see how it works

Tutorial: Junta de Andalucía

Step by step on how to extract data from PDFs

IDB Blog: release data in PDFs

Practical guide for governments and journalists

Video: Tabula by Zonalitica

Practical demonstration in Spanish

Pavimentados
Optimizing road maintenance and signaling with computer vision.

Transport
Geolocation
Image processing
UrbanPy
Simplifying urban data collection and analysis for effective planning.

Urban Development and Housing
Geolocation
Database management
SunScan IDB
Facilitating the evaluation of rooftop solar potential with advanced and accessible technology.

Energy
Geolocation
Image processing
URSA
Facilitating urban planning with accessible data.

Urban Development and Housing
Simulators
Geolocation
MAIIA
Identifying informal settlements with artificial intelligence.

Urban Development and Housing
Image processing
Urbantrips
Turning transportation data into complex analysis to improve management.

Transport
Geolocation