
Table extraction is the task of detecting the tables in a document and extracting them into structured output that can be consumed by workflow applications such as robotic process automation (RPA) services, data analysis tools such as Excel, databases, and search services.

Customers often rely on manual processes for data extraction and digitization. Tables are often the most important part of a document, but extracting data from them presents a unique set of challenges: accurately detecting the tabular region within an image, then detecting and extracting information from the rows and columns of the detected table, and handling merged cells, complex tables, nested tables, and more. Tables are common in financial, legal, insurance, and oil and gas documents, among others.

Authors: Lei Sun, Neta Haiby, Cha Zhang, Sanjeev Jagtap

Documents containing tables pose a major hurdle for information extraction.
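
To make the idea of "structured output" concrete, here is a minimal Python sketch of how an extracted table, including a merged cell, might be represented and flattened into CSV for a tool such as Excel. The `Cell` class, its field names, and the `to_grid` and `to_csv` helpers are hypothetical and chosen only for illustration; they are not the output schema of any particular product, and repeating a merged cell's text across its span is just one possible convention.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Cell:
    """One extracted cell (hypothetical schema); spans > 1 model a merged cell."""
    row: int
    col: int
    text: str
    row_span: int = 1
    col_span: int = 1


def to_grid(cells, n_rows, n_cols):
    """Expand cells into a dense grid, repeating merged-cell text across its span."""
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        for r in range(c.row, c.row + c.row_span):
            for k in range(c.col, c.col + c.col_span):
                grid[r][k] = c.text
    return grid


def to_csv(grid):
    """Serialize the dense grid as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


# Made-up example: a header row whose "Revenue" cell is merged across two columns.
cells = [
    Cell(0, 0, "Region"),
    Cell(0, 1, "Revenue", col_span=2),
    Cell(1, 0, "North"),
    Cell(1, 1, "Q1: 10"),
    Cell(1, 2, "Q2: 12"),
]
grid = to_grid(cells, n_rows=2, n_cols=3)
print(to_csv(grid))
```

Keeping the row and column spans in the intermediate representation, rather than flattening immediately, lets each downstream consumer decide how to treat merged cells: a spreadsheet export might repeat the text as above, while a database loader might instead reject or split such rows.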
