
Language: Java
Tool Type: Desktop app
License: The MIT License
Version: 1.2.1
Author: Manuel Aristarán

Tabula is a desktop application that extracts tables from PDF files and converts them into editable formats such as CSV or Excel. It was developed to simplify data extraction from PDF documents, where tabular data is otherwise hard to reuse, and it is shared as open source to make data in non-editable formats more accessible. It is widely used in journalism and data analysis.
Tabula solves the problem of accessing and manipulating data trapped in tables within PDF documents, a notoriously difficult format for editing and extraction. It facilitates data analysis and supports journalists and analysts in their investigative work.
Tabula converts tables in text-based PDFs to CSV or Excel. Processing is local, which keeps the data secure, and no internet access is required. It offers a user-friendly web interface and runs on multiple platforms (Windows, Mac, Linux).
Tabula requires a Java runtime environment compatible with Java 7 or higher. It can be used from JVM languages, and JRuby and R bindings are available for development in other languages. It can be deployed in a container using Docker. It is released under the MIT License, which permits open use and distribution of the software.
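For scripted use, tabula-java (the extraction engine behind Tabula) can also be run from the command line on any machine with a Java runtime. The sketch below only assembles such a command as an argument list. The class name `TabulaCommand`, the jar path, and the file names are placeholders, and the flags shown (`--pages`, `--format`, `--outfile`) should be checked against the `--help` output of the tabula-java release in use.

```java
import java.util.ArrayList;
import java.util.List;

class TabulaCommand {
    /** Builds the argument list for a tabula-java CLI run. The jar path is a placeholder. */
    static List<String> build(String jarPath, String inputPdf, String outputCsv) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.add("--pages");   // which pages to scan
        cmd.add("all");
        cmd.add("--format");  // output format
        cmd.add("CSV");
        cmd.add("--outfile"); // where to write the extracted tables
        cmd.add(outputCsv);
        cmd.add(inputPdf);    // the PDF to read comes last
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("tabula-java.jar", "report.pdf", "report.csv");
        System.out.println(String.join(" ", cmd));
        // To actually run it: new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Keeping the command as a list rather than a single string avoids quoting problems when file names contain spaces.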


Table with budget deficit projections for 2012-2022. Shows annual and total figures for growth measures, health, defense, tax revenues, and more, in billions of dollars.

This image presents a Java code snippet for an API usage example that extracts rows and cells from tables in a PDF document using Apache PDFBox.
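Whatever API produces them, extracted tables end up as rows of cell text. As a rough, library-independent sketch of the final step (not Tabula's own code), the following turns such rows into CSV lines with standard quoting. The class `CsvRows`, its methods, and the sample data are hypothetical and use only the Java standard library.

```java
import java.util.List;
import java.util.stream.Collectors;

class CsvRows {
    /** Quotes a cell if it contains a comma, a double quote, or a line break. */
    static String escape(String cell) {
        if (cell.contains(",") || cell.contains("\"") || cell.contains("\n") || cell.contains("\r")) {
            // Embedded quotes are doubled, then the whole cell is wrapped in quotes.
            return "\"" + cell.replace("\"", "\"\"") + "\"";
        }
        return cell;
    }

    /** Joins cells with commas and rows with newlines. */
    static String toCsv(List<List<String>> rows) {
        return rows.stream()
                .map(row -> row.stream().map(CsvRows::escape).collect(Collectors.joining(",")))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Made-up sample rows, not taken from any real PDF.
        List<List<String>> rows = List.of(
                List.of("Category", "2012", "2013"),
                List.of("Defense, total", "10", "12"));
        System.out.println(toCsv(rows));
    }
}
```

Cells containing commas, such as `Defense, total` above, are wrapped in quotes so that spreadsheet tools read them as a single column.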

Tabula thanks: acknowledgment of tabula-java support from donors (DB, IN, BS, SG, ER), and of financial backing from the Knight Foundation and the Shuttleworth Foundation.
Main page to download and see how it works
Step by step on how to extract data from PDFs
Practical guide to use for governments and journalists
Practical demonstration in Spanish
