Read Data From Excel using Apache POI jars

Excel files (spreadsheets) are widely used by people all over the world for various tasks related to organization, analysis, and storage of tabular data.

Since excel files are so common, we developers often encounter use-cases when we need to read data from an excel file or generate a report in excel format.

In this article, I’ll show you how to read excel files in Java using a very simple yet powerful open source library called Apache POI.


Let’s get started!


Dependencies


First of all, We need to add the required dependencies for including Apache POI in our project. If you use maven, you need to add the following dependencies to your pom.xml file

<!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.17</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.17</version>
</dependency>

The first dependency poi is used to work with the old Microsoft’s binary file format for excel. These file formats have .xls extension.

The second dependency poi-ooxml is used to work with the newer XML based file format. These file formats have .xlsx extension.


Sample Excel file that We’ll read


Following is a sample excel file that we’ll read in our code. It is created using Google Sheets and has .xlsx extension.

Note that, Although the sample file is of the newer XML based file format (.xlsx). The code that we’ll write will work with both types of file formats - .xls and .xlsx


Apache POI terminologies


Apache POI excel library revolves around following four key interfaces -
  1. Workbook: A workbook is the high-level representation of a Spreadsheet.
  2. Sheet: A workbook may contain many sheets. The sample excel file that we looked at in the previous section has two sheets - Employee and Department
  3. Row: As the name suggests, It represents a row in the spreadsheet.
  4. Cell: A cell represents a column in the spreadsheet.


Program to Read an excel file using Apache POI


The following program shows you how to read an excel file using Apache POI. Since we’re not using any file format specific POI classes, the program will work for both types of file formats - .xls and .xlsx.

The program shows three different ways of iterating over sheets, rows, and columns in the excel file

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
 public class DatafromExcel {
     public static void main(String[] args) throws Exception{
       // Initialize fileInputStream and set path of excel file
       InputStream is = new FileInputStream("src/DATA.xls");
        // Create workbook and access file into Workbook
         Workbook wb = WorkbookFactory.create(is);
         // Get sheet by sending sheet id
         Sheet s = wb.getSheetAt(0);
         // Get row count
         int rc = s.getLastRowNum();
         // Loop for every Row
         for (int j = 0; j < rc + 1; j++) {
                  // object for Row and pass which row need to be accessed
                     Row r = s.getRow(j);
                      // Get Column count
                       int cc = r.getLastCellNum();
                       // Loop to Every cell
                      for (int k = 0; k < cc; k++) {
                           // Get Cell data
                           String cellData = r.getCell(k).getStringCellValue();
                            // Print cell data
                             System.out.print(cellData + " ");
                          }
                          // New Line
                            System.out.println();
                          }
           }
}

Output :
USER NAME PASSWORD
krishna gt1234
reddy selenium


Retrieving Cell values by CellType

Instead of using a DataFormatter to format and get each cell’s value as String regardless of the Cell type, You may check each cell’s type and then retrieve its value using various type-specific methods like this -

private static void printCellValue(Cell cell) {
    switch (cell.getCellTypeEnum()) {
        case BOOLEAN:
            System.out.print(cell.getBooleanCellValue());
            break;
        case STRING:
            System.out.print(cell.getRichStringCellValue().getString());
            break;
        case NUMERIC:
            if (DateUtil.isCellDateFormatted(cell)) {
                System.out.print(cell.getDateCellValue());
            } else {
                System.out.print(cell.getNumericCellValue());
            }
            break;
        case FORMULA:
            System.out.print(cell.getCellFormula());
            break;
        case BLANK:
            System.out.print("");
            break;
        default:
            System.out.print("");
    }
    System.out.print("\t");
}