Data post elaboration pipeline and merged regions
In some cases it is useful to elaborate the data further after the export file has been generated. A good example is the creation of merged regions.
For this reason MemPOI introduces the Data post elaboration system. Its core is the list of MempoiColumnElaborationStep objects added to the MempoiColumn class.
The elaboration consists of 2 phases: analyzing the data and applying transformations based on the data collected during the analysis. This is the working process:
- after each row is added to each sheet -> analyze and collect data
- after the last row is added to each sheet -> close the analysis, performing any final operations
- after data export completion -> apply data transformations
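The three steps above can be sketched in plain Java for the merged regions case. This is only an illustration under stated assumptions: the class MergedRegionSketch and its methods analyze, closeAnalysis and apply are hypothetical names chosen to mirror the process, not MemPOI's actual interface, and the "transformation" simply returns the row ranges that a real step would merge in the workbook.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical model of the analyze / close / apply flow; not MemPOI code.
final class MergedRegionSketch {

    /** Rows firstRow..lastRow (inclusive) holding the same value. */
    static final class Region {
        final int firstRow;
        final int lastRow;

        Region(int firstRow, int lastRow) {
            this.firstRow = firstRow;
            this.lastRow = lastRow;
        }

        @Override
        public String toString() {
            return firstRow + "-" + lastRow;
        }
    }

    private final List<Region> regions = new ArrayList<>();
    private Object lastValue;
    private int runStart = -1; // -1 means no run is open
    private int lastRow = -1;

    /** Called after each row is added: extend the open run or start a new one. */
    void analyze(int rowIndex, Object value) {
        if (runStart < 0) {
            runStart = rowIndex;
        } else if (!Objects.equals(value, lastValue)) {
            closeRun();
            runStart = rowIndex;
        }
        lastValue = value;
        lastRow = rowIndex;
    }

    /** Called after the last row is added: the final run is still open. */
    void closeAnalysis() {
        if (runStart >= 0) {
            closeRun();
            runStart = -1;
        }
    }

    /** Called after the export: a real step would merge cells here. */
    List<Region> apply() {
        return new ArrayList<>(regions);
    }

    private void closeRun() {
        if (lastRow > runStart) { // single-row runs need no merge
            regions.add(new Region(runStart, lastRow));
        }
    }

    /** Drives the three phases over one column of values. */
    static List<Region> collect(List<String> columnValues) {
        MergedRegionSketch step = new MergedRegionSketch();
        for (int row = 0; row < columnValues.size(); row++) {
            step.analyze(row, columnValues.get(row));
        }
        step.closeAnalysis();
        return step.apply();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "a", "b", "c", "c", "c");
        System.out.println(collect(names)); // prints [0-1, 3-5]
    }
}
```

Note why the closing phase is needed: the last run of equal values is never followed by a different value, so without closeAnalysis it would never be recorded.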
You can create your own implementation of a Data post elaboration system step in 2 ways:
- implementing the base interface MempoiColumnElaborationStep
- extending the abstract class StreamApiElaborationStep
MempoiColumnElaborationStep represents the base functionality and defines the methods you should implement to manage your desired data post elaboration flow.
You can find an example in NotStreamApiMergedRegionsStep.
StreamApiElaborationStep supplies some basic implementations for dealing with the Apache POI stream API.
You then have to implement the interface logic methods, as for MempoiColumnElaborationStep.
You can find an example in StreamApiMergedRegionsStep.
The main difference lies in the underlying Apache POI system, so it is good practice to choose the implementation that matches the Workbook implementation in use.
The main behaviors of each one are listed below:
MempoiColumnElaborationStep
- it should be used with HSSF or XSSF
- it accesses the generated Workbook entirely in memory, so a document that is too large could saturate your memory and cause an error
- memory is never flushed
StreamApiElaborationStep
- it should be used with SXSSF
- it should access only a portion of the generated Workbook, keeping in mind that at any time only a subset of the created rows is loaded in memory
- you can tune the workbook's RandomAccessWindowSize property to find the configuration you need, or you can keep its default value
- memory is flushed in order to keep only a subset of the generated rows in memory
- the memory flush mechanism is automated but fragile, as reported by the Apache POI documentation, so it has to be used carefully
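To make the row window more concrete, here is a minimal plain Java model of it. It is a sketch under assumptions: RowWindowSketch is a hypothetical class that only mimics the idea of keeping the most recent rows in memory, and it does not use or reproduce Apache POI. For reference, Apache POI documents a default window of 100 rows for SXSSF.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of a row window; it does not use Apache POI.
final class RowWindowSketch {
    private final int windowSize;
    private final Deque<Integer> rowsInMemory = new ArrayDeque<>();
    private int flushedRows = 0;

    RowWindowSketch(int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.windowSize = windowSize;
    }

    /** Adds a row; the oldest row is flushed once the window is full. */
    void createRow(int rowIndex) {
        rowsInMemory.addLast(rowIndex);
        if (rowsInMemory.size() > windowSize) {
            rowsInMemory.removeFirst();
            flushedRows++;
        }
    }

    /** Only rows still inside the window can be read or changed. */
    boolean isAccessible(int rowIndex) {
        return rowsInMemory.contains(rowIndex);
    }

    int flushedCount() {
        return flushedRows;
    }

    public static void main(String[] args) {
        RowWindowSketch sheet = new RowWindowSketch(3);
        for (int row = 0; row < 10; row++) {
            sheet.createRow(row);
        }
        System.out.println(sheet.isAccessible(9)); // prints true
        System.out.println(sheet.isAccessible(6)); // prints false
        System.out.println(sheet.flushedCount());  // prints 7
    }
}
```

This is why a step written for the stream API cannot assume that rows analyzed early in the export are still available when the transformations are applied.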
You can add as many steps as you want as follows:
MempoiSheetBuilder.aMempoiSheet()
        .withSheetName("Multiple steps")
        .withPrepStmt(prepStmt)
        .withDataElaborationStep("name", step1)
        .withDataElaborationStep("usefulChar", step2)
        .withDataElaborationStep("name", step3);
Note that you can add more than one step to each column. Keep in mind that order matters: for each column, steps are executed in the order in which they were added. Built-in steps (like Merged Regions) are always added first. If you want to change this behavior, you can configure those steps explicitly instead of using the built-in functionalities.
For example, both of the following snippets execute the merged regions step first and then the custom one:
MempoiSheetBuilder.aMempoiSheet()
        .withSheetName("Multiple steps")
        .withPrepStmt(prepStmt)
        .withMergedRegionColumns(new String[]{"name"})
        .withDataElaborationStep("name", customStep);

MempoiSheetBuilder.aMempoiSheet()
        .withSheetName("Multiple steps")
        .withPrepStmt(prepStmt)
        .withDataElaborationStep("name", customStep)
        .withMergedRegionColumns(new String[]{"name"});
But this one executes the custom step first and then the merged regions one:
// columnList and colIndex must already be defined by the caller
MempoiSheetBuilder.aMempoiSheet()
        .withSheetName("Multiple steps")
        .withPrepStmt(prepStmt)
        .withDataElaborationStep("name", customStep)
        .withDataElaborationStep("name", new NotStreamApiMergedRegionsStep<>(columnList.get(colIndex).getCellStyle(), colIndex));
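The ordering rule shown by the snippets above can be modeled in a few lines of plain Java. This is a sketch under assumptions, not MemPOI's source: StepOrderSketch is a hypothetical class, its method names only echo the builder methods for readability, and steps are represented by plain string labels instead of real step objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the ordering rule; steps are labels, not MemPOI objects.
final class StepOrderSketch {
    private final List<String> mergedColumns = new ArrayList<>();
    private final Map<String, List<String>> customSteps = new LinkedHashMap<>();

    StepOrderSketch withMergedRegionColumns(String... columns) {
        mergedColumns.addAll(Arrays.asList(columns));
        return this;
    }

    StepOrderSketch withDataElaborationStep(String column, String stepLabel) {
        customSteps.computeIfAbsent(column, key -> new ArrayList<>()).add(stepLabel);
        return this;
    }

    /** Built-in step first, then the custom steps in the order they were added. */
    List<String> stepsFor(String column) {
        List<String> ordered = new ArrayList<>();
        if (mergedColumns.contains(column)) {
            ordered.add("builtInMergedRegions");
        }
        ordered.addAll(customSteps.getOrDefault(column, Collections.<String>emptyList()));
        return ordered;
    }

    public static void main(String[] args) {
        // Built-in declared before the custom step
        System.out.println(new StepOrderSketch()
                .withMergedRegionColumns("name")
                .withDataElaborationStep("name", "custom")
                .stepsFor("name")); // prints [builtInMergedRegions, custom]

        // Built-in declared after the custom step: same resulting order
        System.out.println(new StepOrderSketch()
                .withDataElaborationStep("name", "custom")
                .withMergedRegionColumns("name")
                .stepsFor("name")); // prints [builtInMergedRegions, custom]

        // Merged regions added as an explicit step: insertion order is kept
        System.out.println(new StepOrderSketch()
                .withDataElaborationStep("name", "custom")
                .withDataElaborationStep("name", "explicitMergedRegions")
                .stepsFor("name")); // prints [custom, explicitMergedRegions]
    }
}
```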
Currently MemPOI supplies only one built-in Data post elaboration system step, which eases the management of merged regions.
All you have to do is pass to the MempoiSheetBuilder a String array representing the list of columns to merge.
// columns to merge
String[] mergedColumns = new String[]{"name"};

MempoiSheet mempoiSheet = MempoiSheetBuilder.aMempoiSheet()
        .withSheetName("Merged regions name column 2")
        .withPrepStmt(prepStmt)
        .withMergedRegionColumns(mergedColumns)
        .withStyleTemplate(new RoseStyleTemplate())
        .build();

// HSSFWorkbook keeps the whole document in memory
MemPOI memPOI = MempoiBuilder.aMemPOI()
        .withFile(fileDest)
        .withStyleTemplate(new ForestStyleTemplate())
        .withWorkbook(new HSSFWorkbook())
        .addMempoiSheet(mempoiSheet)
        .build();

// run the export and wait for it to complete
memPOI.prepareMempoiReport().get();