Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load Data from Excel Table by Name #326

Closed
frederikb opened this issue Nov 19, 2024 · 4 comments
Closed

Load Data from Excel Table by Name #326

frederikb opened this issue Nov 19, 2024 · 4 comments

Comments

@frederikb
Copy link

Hi,

first of all thanks for sharing this library. I've previously used POI directly to load data from XLSX files and appreciate any approach to abstract away the common pain points!

My question (or probably feature suggest): is it possible to load data from a table declared in a XLSX file by name from a sheet?

As in XSSFTables which are accessed via those returned from XSSFSheet#getTables and which provides such methods such Table#getStartRowIndex, Table#getHeadersRowCount and Table#findColumnIndex

The idea being that the code to load data from the sheet would be a bit more robust against the table start row/column changing in the document.

Is this already possible or something you've considered as potentially in scope?

Thanks!

Copy link

Thank you for contributing to Poiji! Feel free to create a PR If you want to contribute directly :)

@ozlerhakan
Copy link
Owner

Hi @frederikb ,

Thank you for finding Poiji useful! Regarding the question, we do not have support XSSFTable in Poiji. If you have some samples, we could find a way to extend the tool with this feature.

Best!

@frederikb
Copy link
Author

Hi, of course!

I've modified the cars.xlsx from your repo by shifting the table slightly down and to the right and turning it into a table (select cells and press Ctrl+ / Cmd+T.

cars.xlsx

In case you have JBang installed you can run this script:

///usr/bin/env jbang "$0" "$@" ; exit $?
//DEPS org.apache.poi:poi-ooxml:5.3.0

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;

import java.io.FileInputStream;
import java.io.InputStream;
import java.io.IOException;

public class cars {

    public static void main(String[] args) throws IOException {
        String filename = "cars.xlsx";
        try (InputStream file = new FileInputStream(filename);
             XSSFWorkbook workbook = new XSSFWorkbook(file)) {

            XSSFTable table = workbook.getTable("Cars");

            XSSFSheet sheet = table.getXSSFSheet();

            int startRow = table.getStartRowIndex();
            int startColumn = table.getStartColIndex();
            int endRow = table.getEndRowIndex();
            int headerRowCount = table.getHeaderRowCount();
            int dataStartRow = startRow + headerRowCount;

            int nameColIndex = table.findColumnIndex("NAME");
            int yearColIndex = table.findColumnIndex("YEAR");
            int seatsColIndex = table.findColumnIndex("Seats"); // Column names are case-insensitive

            // Loop through the data rows
            DataFormatter formatter = new DataFormatter();
            for (int rowNum = dataStartRow; rowNum <= endRow; rowNum++) {
                Row row = sheet.getRow(rowNum);

                String nameValue = formatter.formatCellValue(row.getCell(startColumn + nameColIndex));
                String yearValue = formatter.formatCellValue(row.getCell(startColumn + yearColIndex));
                String seatsValue = formatter.formatCellValue(row.getCell(startColumn + seatsColIndex));

                System.out.println(nameValue + "\t" + yearValue + "\t" + seatsValue);
            }


        }
    }
}

which will print this output:

Honda Civic	2017	4
Chevrolet Corvette	2016	2

From an API perspective something like the following could be perhaps work:

@ExcelTable("Cars") // no need to even specify the sheet
public class Car {

    @ExcelCellName("NAME")
    private String name;

    @ExcelCellName("YEAR")
    private String id;

    @ExcelCellName("Seats") // case-insensitive by default if using the API from `Table`seen above
    private String phone;
}

More advanced features such as @ExcelCellName(value = "", expression = "Surname|Second name")would still require to perform your own look up such as table.getColumns().stream().filter(c -> c.getName().matches(columnNameExpression)).toList(), but you get the idea.

Looking (very briefly) at the code base you seem to be using a different API / approach for processing the workbooks, so I'm not sure how much of a help this is to you.

@ozlerhakan
Copy link
Owner

ozlerhakan commented Dec 29, 2024

Hi @frederikb ,

I had the possibility to look into this request. The thing is that the table is made of rows and columns, just like a regular excel data, which is accessible with Poiji directly. Poiji only needs an entity class like below to read information from the exact cells:

public class CarModelData {

   @ExcelCell(2)
   private String model;
   @ExcelCell(3)
   private int year;
   @ExcelCell(4)
   private int seats;

   @Override
   public String toString() {
       return "CarModelData{" +
               "model='" + model + '\'' +
               ", year=" + year +
               ", seats=" + seats +
               '}';
   }
}

and all you need to do is use headerCount to point out the right position of the table in the sheet:

PoijiOptions options = PoijiOptions.PoijiOptionsBuilder.settings().headerCount(3).build();
List<CarModelData> models = Poiji.fromExcel(new File(path), CarModelData.class, options);
for (CarModelData model : models) {
    System.out.println(model);
}

The output:

CarModelData{model='Honda Civic', year=2017, seats=4}
CarModelData{model='Chevrolet Corvette', year=2016, seats=2}

I hope it helps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants