PDF Table Extraction API

Extract structured tables from PDF files and return them as JSON, Excel, or CSV. Automatically detects single or multiple tables, supports multi-page PDFs, and delivers deterministic, machine-readable output for data pipelines and automation.

Use this API from your AI agent via MCP

Works with OpenClaw, Claude Code/Desktop, Cursor, Windsurf, Cline and any MCP-compatible AI client.


Create a skill by wrapping this MCP: https://mcp.zylalabs.com/mcp?apikey=YOUR_ZYLA_API_KEY

PDF Table Extraction API enables developers to reliably extract structured tabular data from PDF documents and convert it into machine-readable formats such as JSON, Excel, or CSV.

This API focuses exclusively on true table extraction, not general PDF text parsing. It automatically detects grid-based tabular structures within PDFs and ignores non-tabular content such as titles, headers, footers, and paragraphs. This makes it ideal for automation, ETL pipelines, data ingestion workflows, and backend systems that require clean, predictable output.

Key Capabilities

Detects and extracts one or multiple tables from a single PDF
Supports tables spanning multiple pages
Returns results in JSON, Excel (.xlsx), or CSV
Multiple tables are returned as:
- An array in JSON
- Separate worksheets in Excel
- Separate CSV files packaged in a ZIP archive
Deterministic output: same input always produces the same result
Optional confidence scores per table
Designed for automation and backend use cases

What This API Does

Identifies tabular data based on layout and structure
Preserves row and column alignment
Handles irregular tables, empty cells, and uneven rows
Returns structured output suitable for programmatic processing

What This API Does NOT Do

Does not extract free-form text outside tables
Does not perform OCR on scanned PDFs
Does not attempt semantic interpretation of table contents
Does not modify or enrich data values

Example Use Cases

Extract invoice line items from PDF documents
Convert financial reports into structured datasets
Ingest tabular data from customer-uploaded PDFs
Automate data pipelines from PDF sources
Replace manual copy-paste workflows

Output Formats

JSON

Tables returned as an array
Each table includes rows, page range, and confidence score

Excel (.xlsx)

One workbook per request
Each table placed in a separate worksheet

CSV

Each table exported as a separate CSV file
All CSV files returned in a ZIP archive

API Characteristics

Stateless and privacy-friendly
No data is stored after processing
Secure HTTPS-only communication
Suitable for production workloads

Limitations

Maximum PDF size limits apply
Text-based PDFs only (no OCR support)
Tables must be visually structured (grid or aligned rows)

Designed For Developers

This API is designed for developers who need reliable table extraction, predictable output, and clean integration into automated systems — without the complexity or cost of large enterprise document platforms.

Summary

If you need structured data from PDF tables — not text blobs, not images, and not manual cleanup — this API provides a fast, deterministic, and developer-friendly solution.

API Documentation

Endpoints

Extract Data

Extracts structured tabular data from PDF documents and returns it in machine-readable formats. Automatically detects one or more tables within a PDF, ignores non-tabular text, and outputs clean data as JSON, Excel (multiple sheets), or CSV. Designed for automation, data pipelines, and backend processing with deterministic results.

                                                                            
POST https://www.zylalabs.com/api/11754/pdf+table+extraction+api/22299/extract+data

Extract Data - Endpoint Features

Object	Description
`pages`	[Optional] Pages to extract. Examples: "all", "1,3-5", or [1,3,4,5]
`fileBase64`	[Optional] Base64-encoded PDF (alternative to multipart upload)
`Request Body`	[Required] File Binary
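The `pages` examples describe the same selection in two shapes: the string "1,3-5" and the list [1,3,4,5]. The sketch below is a client-side helper that expands the string form into the list form, which can be useful for validating user input before a request is sent. The name `expand_pages` is hypothetical, 1-based page numbering is an assumption, and the API's own parsing rules may differ.

```python
from typing import Optional


def expand_pages(spec: str) -> Optional[list[int]]:
    """Expand a page spec such as "1,3-5" into [1, 3, 4, 5].

    Returns None for "all", meaning no page filter.
    """
    spec = spec.strip()
    if spec.lower() == "all":
        return None
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    # Assumption: pages are numbered from 1.
    if any(page < 1 for page in pages):
        raise ValueError("page numbers start at 1")
    return sorted(pages)


if __name__ == "__main__":
    print(expand_pages("1,3-5"))
```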


API EXAMPLE RESPONSE

       
                                                                                                        
{"tables":[{"tableIndex":0,"pageRange":[1,1],"rows":[["Lorem ipsum","","","","","","","",""],["condimentum.","Vivamus","dapibus","sodales","ex,","vitae","malesuada","ipsum","cursus"],["convallis. Maecenas sed egestas nulla, ac condimentum orci.","Mauris diam felis,","","","","","","",""],["ac accumsan nunc vehicula vitae.","Nulla eget justo in felis tristique fringilla. Morbi sit amet","","","","","","",""],["","Maecenas non lorem quis tellus placerat varius.","","","","","","",""],["","Aenean congue fringilla justo ut aliquam.","","","","","","",""],["","Mauris id ex erat.","Nunc vulputate neque vitae justo facilisis, non condimentum ante","","","","","",""],["sagittis.","","","","","","","",""],["","Morbi viverra semper lorem nec molestie.","","","","","","",""],["","Maecenas tincidunt est efficitur ligula euismod, sit amet ornare est vulputate.","","","","","","",""],["12","","","","","","","",""],["10","","","","","","","",""],["8","","","","","","","",""],["Column 1","","","","","","","",""],["6","","","","","","","",""],["Column 2","","","","","","","",""],["4 Column 3","","","","","","","",""],["2","","","","","","","",""],["0","","","","","","","",""],["Row 1","Row 2","Row 3","Row 4","","","","",""]],"rowCount":20,"columnCount":9,"strategyUsed":"stream","warnings":[],"confidence":0.85},{"tableIndex":1,"pageRange":[2,2],"rows":[["velit.","Pellentesque","fermentum","nisl","vitae","fringilla","venenatis.","Etiam","id","mauris","vitae","orci"],["a.","","","","","","","","","","",""],["Lorem ipsum","Lorem ipsum","Lorem ipsum","","","","","","","","",""],["1","In eleifend velit vitae libero sollicitudin \neuismod.","Lorem","","","","","","","","",""],["2","Cras fringilla ipsum magna, in fringilla dui commodo Ipsum","","","","","","","","","",""],["a.","","","","","","","","","","",""],["3","Aliquam erat volutpat.","Lorem","","","","","","","","",""],["4","Fusce vitae vestibulum velit.","Lorem","","","","","","","","",""],["5","Etiam vehicula luctus fermentum.","Ipsum","","","","","","","","",""],["et","pulvinar","nunc.","Pellentesque","fringilla","mollis","efficitur.","Nullam","venenatis","commodo","",""]],"rowCount":10,"columnCount":12,"strategyUsed":"stream","warnings":[],"confidence":0.85},{"tableIndex":2,"pageRange":[3,3],"rows":[["elit.","","","","","","","","","","",""],["dictum tellus.","","","","","","","","","","",""],["Aliquam","erat","volutpat.","Vestibulum","in","egestas","velit.","Pellentesque","fermentum","nisl","vitae",""],["fringilla","venenatis.","Etiam","id","mauris","vitae","orci","maximus","ultricies.","Cras","fringilla","ipsum"],["et","pulvinar","nunc.","Pellentesque","fringilla","mollis","efficitur.","Nullam","venenatis","commodo","",""]],"rowCount":5,"columnCount":12,"strategyUsed":"stream","warnings":[],"confidence":0.85}],"summary":{"tableCount":3,"pageCount":4}}
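A short sketch of reading a response shaped like the example. It relies only on field names visible in that example (`tables`, `tableIndex`, `pageRange`, `rowCount`, `columnCount`, `confidence`, `summary`). The trimmed `SAMPLE` payload is an illustrative stand-in with the same structure, not real API output, and `describe_tables` is a hypothetical helper name.

```python
import json

# Illustrative payload: same field names as the example response, trimmed rows.
SAMPLE = """
{"tables": [
  {"tableIndex": 0, "pageRange": [1, 1],
   "rows": [["Row 1", "Row 2", "Row 3", "Row 4"]],
   "rowCount": 1, "columnCount": 4,
   "strategyUsed": "stream", "warnings": [], "confidence": 0.85}
 ],
 "summary": {"tableCount": 1, "pageCount": 1}}
"""


def describe_tables(payload: dict) -> list[str]:
    """Return one human-readable line per extracted table."""
    lines = []
    for table in payload["tables"]:
        first_page, last_page = table["pageRange"]
        lines.append(
            f"table {table['tableIndex']}: pages {first_page}-{last_page}, "
            f"{table['rowCount']} rows x {table['columnCount']} columns, "
            f"confidence {table['confidence']}"
        )
    return lines


if __name__ == "__main__":
    data = json.loads(SAMPLE)
    for line in describe_tables(data):
        print(line)
```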

Extract Data - CODE SNIPPETS


    curl --location --request POST 'https://zylalabs.com/api/11754/pdf+table+extraction+api/22299/extract+data' \
    --header 'Authorization: Bearer YOUR_API_KEY' \
    --form 'image=@"FILE_PATH"'

API Access Key & Authentication

After signing up, every developer is assigned a personal API access key, a unique combination of letters and digits that grants access to the API endpoint. To authenticate with the PDF Table Extraction API, include this key as a bearer token in the Authorization header.

Headers

Header	Description
`Authorization`	[Required] Should be `Bearer access_key`. See "Your API Access Key" above when you are subscribed.
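A hedged sketch of preparing an authenticated request with Python's standard library. The endpoint URL and the `Bearer` header format come from this page. Sending `fileBase64` and `pages` as fields of a JSON body is an assumption, since the exact request encoding is not spelled out here. `build_request` is a hypothetical helper that constructs the request without sending it.

```python
import base64
import json
import urllib.request

# Endpoint URL as listed in the documentation on this page.
ENDPOINT = (
    "https://www.zylalabs.com/api/11754/pdf+table+extraction+api/"
    "22299/extract+data"
)


def build_request(
    pdf_bytes: bytes, api_key: str, pages: str = "all"
) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated extraction request."""
    # Assumption: `fileBase64` and `pages` are accepted as JSON body fields.
    body = json.dumps({
        "fileBase64": base64.b64encode(pdf_bytes).decode("ascii"),
        "pages": pages,
    }).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_request(b"%PDF-1.4 placeholder", "YOUR_API_KEY")
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request), then read the response body.
```

Keeping request construction separate from sending makes the header and body easy to inspect in tests without spending quota.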


Simple Transparent Pricing

No long-term commitment. Upgrade, downgrade, or cancel anytime. Free Trial includes up to 50 requests.

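Plans can be billed monthly or annually; annual billing saves 2 months 🎉. Each plan is listed twice below: first at its monthly-billing price, then at its annual-billing price expressed per month.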

💫Basic

$24.99/Month

500 Requests / Month
Then $0.0649740 per request if limit exceeded.
Rate Limit: 60 reqs per minute
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included

Free 7-day trial

No commitment. Cancel anytime

Popular

⚡Pro

$49.99/Month

2,000 Requests / Month
Then $0.0649740 per request if limit exceeded.
Rate Limit: 60 reqs per minute
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included

Free 7-day trial

No commitment. Cancel anytime

🔥Pro Plus

$99.99/Month

6,000 Requests / Month
Then $0.0649740 per request if limit exceeded.
Rate Limit: 120 reqs per minute
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included

Free 7-day trial

No commitment. Cancel anytime

⚜️Premium

$199.99/Month

15,000 Requests / Month
Then $0.0649740 per request if limit exceeded.
Rate Limit: 120 reqs per minute
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included

Free 7-day trial

No commitment. Cancel anytime

💫Basic

$20.83/Month

500 Requests / Month
Then $0.0649740 per request if limit exceeded.
Rate Limit: 60 reqs per minute
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included

Free 7-day trial

No commitment. Cancel anytime

Popular

⚡Pro

$41.66/Month

2,000 Requests / Month
Then $0.0649740 per request if limit exceeded.
Rate Limit: 60 reqs per minute
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included

Free 7-day trial

No commitment. Cancel anytime

🔥Pro Plus

$83.33/Month

6,000 Requests / Month
Then $0.0649740 per request if limit exceeded.
Rate Limit: 120 reqs per minute
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included

Free 7-day trial

No commitment. Cancel anytime

⚜️Premium

$166.66/Month

15,000 Requests / Month
Then $0.0649740 per request if limit exceeded.
Rate Limit: 120 reqs per minute
Specialized Customer Support
Real-Time API Monitoring
Unlimited Data Transfer Included

Free 7-day trial

No commitment. Cancel anytime
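The overage rule in the plans above is simple arithmetic: the plan price plus $0.0649740 for each request beyond the included quota. A small sketch using the monthly-billing figures listed above follows. It assumes overage is charged linearly per request; `monthly_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Monthly-billing figures from the plan list: (price, included requests).
PLANS = {
    "Basic": (Decimal("24.99"), 500),
    "Pro": (Decimal("49.99"), 2000),
    "Pro Plus": (Decimal("99.99"), 6000),
    "Premium": (Decimal("199.99"), 15000),
}
OVERAGE_PER_REQUEST = Decimal("0.0649740")


def monthly_cost(plan: str, requests_made: int) -> Decimal:
    """Base price plus overage for requests beyond the plan's quota."""
    base_price, included = PLANS[plan]
    extra = max(0, requests_made - included)
    return base_price + extra * OVERAGE_PER_REQUEST


if __name__ == "__main__":
    for plan in PLANS:
        print(plan, monthly_cost(plan, 2500))
```

Under these figures, Basic with overage becomes more expensive than Pro from about 885 requests per month, so comparing plans at the expected volume is worthwhile.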

🚀 Enterprise

Starts at $10,000/Year

Custom Volume
Custom Rate Limit
Specialized Customer Support
Real-Time API Monitoring


Customer favorite features

✔︎ Only Pay for Successful Requests
✔︎ Free 7-Day Trial
✔︎ Multi-Language Support
✔︎ One API Key, All APIs.
✔︎ Intuitive Dashboard

✔︎ Comprehensive Error Handling
✔︎ Developer-Friendly Docs
✔︎ Postman Integration
✔︎ Secure HTTPS Connections
✔︎ Reliable Uptime

PDF Table Extraction API FAQs

What type of data does the PDF Table Extraction API return?

The API returns structured tabular data extracted from PDF documents. This includes multiple tables, each represented as an array in JSON format, with options to receive the data in Excel (.xlsx) or CSV formats.

What are the key fields in the response data?

Each table in the response includes the fields `tableIndex`, `pageRange`, `rows`, `rowCount`, `columnCount`, `strategyUsed`, `warnings`, and `confidence`. Each table's data is organized to facilitate easy programmatic processing.

How is the response data organized?

The response is a JSON object with two top-level keys: `tables`, an array with one entry per extracted table, and `summary`, which reports the total number of tables (`tableCount`) and pages (`pageCount`). Each table entry contains its rows, page range, and confidence score, making it easy to navigate and use.

What parameters can be used with the endpoint?

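The primary input is the PDF file itself, sent as the request body or, alternatively, Base64-encoded in `fileBase64`. The optional `pages` parameter limits extraction to specific pages (for example "all", "1,3-5", or [1,3,4,5]). Additional options may include the output format (JSON, Excel, CSV) and settings for confidence scoring.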

How is data accuracy maintained?

Data accuracy is maintained through deterministic output, meaning the same input consistently produces the same result. The API also provides optional confidence scores for each table, indicating the reliability of the extraction.

What are typical use cases for this data?

Typical use cases include extracting invoice line items, converting financial reports into structured datasets, automating data pipelines, and ingesting tabular data from customer-uploaded PDFs, streamlining data processing workflows.

How can users effectively utilize the returned data?

Users can leverage the structured output for integration into data pipelines, ETL processes, or backend systems. The organized format allows for easy manipulation and analysis of the extracted tables in various applications.

What are standard data patterns to expect?

Users can expect data patterns that reflect the original table structure, including row and column alignment. The API handles irregular tables and empty cells, ensuring that the output remains structured and usable for further processing.

What types of tables can the API extract from PDFs?

The API can extract various types of structured tables, including those with irregular layouts, empty cells, and uneven rows. It automatically detects single or multiple tables within a PDF, ensuring that only grid-based tabular structures are processed.

How does the API handle multi-page tables?

The API supports tables that span multiple pages, accurately capturing the entire table structure and returning it in a single output. Each table's page range is included in the response for easy reference.

Can users specify the output format for the extracted data?

Yes, users can customize their data requests by specifying the desired output format: JSON, Excel (.xlsx), or CSV. This flexibility allows integration into various applications and workflows.

What optional features does the API provide for data extraction?

The API offers optional confidence scores for each extracted table, indicating the reliability of the extraction. This feature helps users assess the quality of the data returned.

How does the API ensure data privacy and security?

The API is designed to be stateless and privacy-friendly, ensuring that no data is stored after processing. It uses secure HTTPS-only communication to protect user data during transmission.

What should users do if the extracted data contains empty cells?

Users can expect the API to handle empty cells gracefully, preserving the overall structure of the table. The output will reflect the original layout, allowing for straightforward data manipulation despite any missing values.
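In the example response, empty cells appear as empty strings and many columns are empty in every row. A minimal post-processing sketch that pads uneven rows and drops columns that are empty throughout follows. The helper name `drop_empty_columns` is hypothetical; this is client-side cleanup, not part of the API.

```python
def drop_empty_columns(rows: list[list[str]]) -> list[list[str]]:
    """Remove columns whose cells are empty in every row."""
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells.
    padded = [row + [""] * (width - len(row)) for row in rows]
    keep = [
        index for index in range(width)
        if any(row[index].strip() for row in padded)
    ]
    return [[row[index] for index in keep] for row in padded]


if __name__ == "__main__":
    # Two rows in the style of the example response.
    sample = [
        ["3", "Aliquam erat volutpat.", "Lorem", "", ""],
        ["a.", "", "", "", ""],
    ]
    for row in drop_empty_columns(sample):
        print(row)
```

Empty cells inside retained columns are kept as empty strings, so row and column alignment is preserved.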

How can users interpret the confidence scores in the response?

Confidence scores range from 0 to 1, indicating the likelihood that the extracted table is accurate. A higher score suggests greater reliability, helping users determine which tables to trust for further processing.
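One way to act on these scores is to route tables below a cutoff to manual review. The sketch below assumes each table is a dict with an optional `confidence` key, as in the example response. The 0.8 threshold is an arbitrary illustration, not a recommended value, and `split_by_confidence` is a hypothetical helper name.

```python
def split_by_confidence(
    tables: list[dict], threshold: float = 0.8
) -> tuple[list[dict], list[dict]]:
    """Split tables into (accepted, needs_review) by confidence score."""
    accepted, needs_review = [], []
    for table in tables:
        # Confidence is optional; a missing score is sent to review.
        score = table.get("confidence")
        if score is not None and score >= threshold:
            accepted.append(table)
        else:
            needs_review.append(table)
    return accepted, needs_review


if __name__ == "__main__":
    demo = [
        {"tableIndex": 0, "confidence": 0.85},
        {"tableIndex": 1, "confidence": 0.4},
        {"tableIndex": 2},
    ]
    accepted, needs_review = split_by_confidence(demo)
    print([t["tableIndex"] for t in accepted])
    print([t["tableIndex"] for t in needs_review])
```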

What is the significance of the `strategyUsed` field in the response?

The `strategyUsed` field indicates the method employed by the API to extract the table data. This information can help users understand the extraction process and assess the suitability of the output for their specific needs.

General FAQs

What is Zyla API Hub?

Zyla API Hub is like a big store for APIs, where you can find thousands of them all in one place. We also offer dedicated support and real-time monitoring of all APIs. Once you sign up, you can pick and choose which APIs you want to use. Just remember, each API needs its own subscription. But if you subscribe to multiple ones, you'll use the same key for all of them, making things easier for you.

What currencies and payment methods are allowed?

Prices are listed in USD (United States Dollar), EUR (Euro), CAD (Canadian Dollar), AUD (Australian Dollar), and GBP (British Pound). We accept all major debit and credit cards. Our payment system uses the latest security technology and is powered by Stripe, one of the world's most reliable payment companies. If you have any trouble paying by card, just contact us at [email protected]

Additionally, if you already have an active subscription in any of these currencies (USD, EUR, CAD, AUD, GBP), that currency will remain for subsequent subscriptions. You can change the currency at any time as long as you don't have any active subscriptions.

Why can't I pay with my local currency even though I see it on the pricing page?

The local currency shown on the pricing page is based on the country of your IP address and is provided for reference only. The actual prices are in USD (United States Dollar). When you make a payment, the charge will appear on your card statement in USD, even if you see the equivalent amount in your local currency on our website. This means you cannot pay directly with your local currency.

My payment was declined, what should I do?

Occasionally, a bank may decline the charge due to its fraud protection settings. We suggest contacting your bank first to check whether it is blocking our charges. You can also open the Billing Portal and change the card used for the payment. If neither of these works and you need further assistance, please contact our team at [email protected]

How will I be charged for my API subscription?

Prices are determined by a recurring monthly or yearly subscription, depending on the chosen plan.

How will my API calls be deducted from my plan?

API calls are deducted from your plan based on successful requests. Each plan comes with a specific number of calls that you can make per month. Only successful calls, indicated by a Status 200 response, will be counted against your total. This ensures that failed or incomplete requests do not impact your monthly quota.

How does your billing cycle work?

Zyla API Hub works on a recurring monthly subscription system. Your billing cycle starts the day you purchase one of the paid plans and renews on the same day of the next month. Cancel your subscription beforehand if you want to avoid future charges.

How do I upgrade my current subscription plan for an API?

To upgrade your current subscription plan, simply go to the pricing page of the API and select the plan you want to upgrade to. The upgrade will be instant, allowing you to immediately enjoy the features of the new plan. Please note that any remaining calls from your previous plan will not be carried over to the new plan, so be aware of this when upgrading. You will be charged the full amount of the new plan.

How can I see the remaining number of API calls I can make this month?

To check how many API calls you have left for the current month, refer to the 'X-Zyla-API-Calls-Monthly-Remaining' field in the response header. For example, if your plan allows 1,000 requests per month and you've used 100, this field in the response header will indicate 900 remaining calls.

How do I find out the maximum number of API requests allowed in my subscription plan?

To see the maximum number of API requests your plan allows, check the 'X-Zyla-RateLimit-Limit' response header. For instance, if your plan includes 1,000 requests per month, this header will display 1,000.

How do I know when my rate limit will reset?

The 'X-Zyla-RateLimit-Reset' header shows the number of seconds until your rate limit resets. This tells you when your request count will start fresh. For example, if it displays 3,600, it means 3,600 seconds are left until the limit resets.
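The three headers described above can be read from any response object that exposes headers as a mapping. A minimal sketch follows. The header names come from this page; the helper name `read_quota` and the tolerance for thousands separators such as "3,600" are assumptions.

```python
from typing import Mapping, Optional


def read_quota(headers: Mapping[str, str]) -> dict[str, Optional[int]]:
    """Extract quota counters from response headers (case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        value = lowered.get(name.lower())
        if value is None:
            return None
        # Assumption: values may contain thousands separators, e.g. "3,600".
        return int(value.replace(",", "").strip())

    return {
        "remaining": as_int("X-Zyla-API-Calls-Monthly-Remaining"),
        "limit": as_int("X-Zyla-RateLimit-Limit"),
        "reset_seconds": as_int("X-Zyla-RateLimit-Reset"),
    }


if __name__ == "__main__":
    print(read_quota({
        "X-Zyla-API-Calls-Monthly-Remaining": "900",
        "X-Zyla-RateLimit-Limit": "1000",
        "X-Zyla-RateLimit-Reset": "3600",
    }))
```

A caller could pause for `reset_seconds` when `remaining` reaches zero instead of retrying immediately.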

Can I cancel anytime?

Yes, you can cancel your plan anytime by going to your account and selecting the cancellation option on the Billing page. Please note that upgrades, downgrades, and cancellations take effect immediately. Additionally, upon cancellation, you will no longer have access to the service, even if you have remaining calls left in your quota.

If I have any problems, who should I contact?

You can contact us through our chat channel to receive immediate assistance. We are always online from 8 am to 5 pm (EST). If you reach us after that time, we will get back to you as soon as possible. Additionally, you can contact us via email at [email protected]

How does the 7-day free trial work?

To give you the opportunity to experience our APIs without any commitment, we offer a 7-day free trial that allows you to make up to 50 API calls at no cost. This trial can be used only once, so we recommend applying it to the API that interests you the most. While most of our APIs offer a free trial, some may not. The trial concludes after 7 days or once you've made 50 requests, whichever occurs first. If you reach the 50 request limit during the trial, you will need to "Start Your Paid Plan" to continue making requests. You can find the "Start Your Paid Plan" button in your profile under Subscription -> Choose the API you are subscribed to -> Pricing tab. Alternatively, if you don't cancel your subscription before the 7th day, your free trial will end, and your plan will automatically be billed, granting you access to all the API calls specified in your plan. Please keep this in mind to avoid unwanted charges.

What happens if I forget to cancel my free trial?

After 7 days, you will be charged the full amount for the plan you were subscribed to during the trial. Therefore, it's important to cancel before the trial period ends. Refund requests for forgetting to cancel on time are not accepted.

How many calls can I make during the free trial?

When you subscribe to an API free trial, you can make up to 50 API calls. If you wish to make additional API calls beyond this limit, the API will prompt you to "Start Your Paid Plan." You can find the "Start Your Paid Plan" button in your profile under Subscription -> Choose the API you are subscribed to -> Pricing tab.

When are Payout Orders processed?

Payout Orders are processed between the 20th and the 30th of each month. If you submit your request before the 20th, your payment will be processed within this timeframe.


Service Level

100%

Response Time

18ms

Category:

Tools & Utilities

Tags:

#Table Extraction

#PDF Processing

#Data Conversion

#Multi-page Support

#Structured Data

#Automation Integration

Related APIs

PDF Text Extractor API

The PDF to Text API is a simple solution for converting PDF files into text or words. It allows...

Tools & Utilities Free 7-Day Trial

Service Level:

91%

Response Time:

2,513ms

PDF into Text API

The PDF to Text API allows users to effortlessly convert PDF files into text or words. By utiliz...

Tools & Utilities Free 7-Day Trial

Service Level:

100%

Response Time:

0ms

Extract Text from Documents API

Seamlessly convert scanned documents into editable text using the Extract Text from Documents AP...

Visual Recognition & Imaging Free 7-Day Trial

Service Level:

100%

Response Time:

1,852ms

Text Extractor API

The TextExtractor API converts scanned images and documents into editable text, extracting and r...

Visual Recognition & Imaging Free 7-Day Trial

Service Level:

100%

Response Time:

3,190ms

Doc to Text API

Unlock the power of data with DocToText API – your ultimate solution for seamless document conve...

Data & Analytics Free 7-Day Trial

Service Level:

100%

Response Time:

0ms

Document OCR Extractor API

Extract structured data from documents with high-accuracy OCR, including personal details, dates...

Visual Recognition & Imaging Free 7-Day Trial

Service Level:

100%

Response Time:

1,852ms

Retrieve Document Text API

Effortlessly extract and retrieve text from documents with our reliable Retrieve Document Text A...

Visual Recognition & Imaging Free 7-Day Trial

Service Level:

100%

Response Time:

1,852ms

Document Data Extraction API

Streamline your workflows with our Document Data Extraction API, designed to transform any struc...

Data & Analytics Free 7-Day Trial

Service Level:

100%

Response Time:

1,529ms

Document Parser API

Automate your workflows with our state-of-the-art, highly accurate lending document parsers.

Finance & Payments

Service Level:

97%

Response Time:

616ms

Extract Text from Images API

Easily convert images into editable text with our Extract Text from Images API for seamless data...

Visual Recognition & Imaging Free 7-Day Trial

Service Level:

100%

Response Time:

748ms
