WorkbookWrapper

(Implemented at xlsx_copycull.xlsx_copycull.WorkbookWrapper but automatically imported as a top-level class, xlsx_copycull.WorkbookWrapper.)

Note

By default, creating a WorkbookWrapper will automatically copy the target spreadsheet, and the original spreadsheet will not be modified. However, this behavior can be suppressed at initialization with no_copy=True, in which case no copy will be generated, and and changes will be made to the original file instead.

class xlsx_copycull.WorkbookWrapper(orig_fp: Path, copy_fp: Optional[Path] = None, uid=None, no_copy=False)

A wrapper class for openpyxl workbooks, with added methods for generating modified copies (e.g., reducing to only the rows relevant to a portion of the data).

By design, this will leave the original spreadsheet alone and will generate a copy to modify.

In particular, look into the .cull() and .add_formulas() methods of the subordinate WorksheetWrapper objects (which get stored in the .ws_dict attribute).

Before modifying any worksheet in the wrapped workbook with the added methods, you MUST stage it with the .stage_ws() method, specifying its name, the row containing the header (defaults to 1), and various optional parameters, such as which rows to leave alone (this will create a WorksheetWrapper object).

Access the staged WorksheetWrapper objects either directly in the .ws_dict attribute (a dict, keyed by sheet name), or by subscripting on the WorkbookWrapper object (passing the sheet name):

``some_wb_wrapper.ws_dict['Sheet1'].cull(<...>)``

    ...is equivalent to...

``some_wb_wrapper['Sheet1'].cull(<...>)``

(Remember, though, that worksheets must first be staged with .stage_ws(), or this would raise a KeyError.)

Warning

As with any script that uses openpyxl to modify spreadsheets, any formulas that exist in the original spreadsheet will most likely NOT survive the insertion or deletion of rows or columns (or changing of worksheet names, etc.). Thus, it is highly recommended that you flatten all possible formulas, and use the .add_formulas() method in the WorksheetWrapper class to the extent possible for your use case.

__getitem__(item)

WorkbookWrapper objects are limitedly subscriptable, in that the WorksheetWrapper objects stored in .ws_dict can be accessed by the corresponding dict key (sheet name):

wb_wrapper = xlsx_copycull.WorkbookWrapper(...)
wb_wrapper.stage_ws('Sheet1', ...)

sheet_wrapper = wb_wrapper['Sheet1']
# ...is equivalent to...
sheet_wrapper = wb_wrapper.ws_dict['Sheet1']
Parameters

item

Returns

__init__(orig_fp: Path, copy_fp: Optional[Path] = None, uid=None, no_copy=False)

A wrapper for an openpyxl Workbook object. Access the Workbook object directly in the .wb attribute. The Workbook will be loaded at init.

Note

.wb will be set to None if the file is not currently open. Open it with the .load_wb() method, close it with .close_wb() (which will NOT save by default), and check whether it is currently open with the .is_loaded property.

Parameters
  • orig_fp – Filepath to the workbook to load (and copy from). Must be in the .xlsx or .xlsm formats!

  • copy_fp – Filepath at which to save the copied workbook. The filename should end in '.xlsx' or '.xlsm'.

  • uid – (Optional) An internal unique identifier.

  • no_copy

    Use this to modify the original spreadsheet, without copying. By default, the original spreadsheet will be copied to the filepath at copy_fp (i.e. no_copy=False).

    Warning

    Using no_copy will irrevocably modify the original spreadsheet.

close_wb() None

Close the workbook, and inform the subordinates that they cannot be modified until the workbook is reopened with .load_wb(). :return: None

copy_original(fp=None, stage_new_fp=False) None

Copy the source spreadsheet to the new filepath at fp, and store that new filepath to .copy_fp. (If fp is not specified here, will default to whatever is already set in .copy_fp.)

Parameters
  • fp – The filepath to copy to.

  • stage_new_fp

    A bool, whether to set the filepath of the newly copied workbook as the target workbook of this WorkbookWrapper object. That is, whether the newly copied spreadsheet is the one we want to be working on. Defaults to False.

    Note

    If the workbook is currently open and stage_new_fp=True is passed, it will raise a RuntimeError. To avoid that error, save and close the workbook first:

    workbook_wrapper.save_wb()
    workbook_wrapper.close_wb()
    

Returns

None

delete_ws(ws_name)

Delete a worksheet from the workbook. (The worksheet need not be staged.)

Parameters

ws_name – The name of the worksheet to discard.

Returns

None

load_wb(**_load_workbook_kwargs)

Open the workbook at the filepath stored in .copy_fp (and behind the scenes, inform all subordinate worksheets that they are now open for modification – by setting their .ws attributes to the appropriate openpyxl worksheet object).

Parameters

_load_workbook_kwargs

(Optional, unsupported) Keyword arguments to pass through to the openpyxl.load_workbook() method. See openpyxl’s documentation for optional parameters.

Warning

load_workbook_kwargs is not strictly supported by the xlsx_copycull module. You may run into unexpected behavior or errors.

Returns

None

mandate_loaded()

Raise an error if the .wb is not currently loaded.

rename_ws(old_name, new_name)

Rename a worksheet. (Workbook must be open, and worksheet with old_name must already be staged.)

Note that renaming the worksheet will also modify the corresponding .ws_dict key:

wb_wrapper_obj = xlsx_copycull.WorkbookWrapper(<...>)
wb_wrapper_obj.load_wb()
wb_wrapper_obj.stage_ws('Sheet1', <...>)
ws_wrapper1 = wb_wrapper_obj.ws_dict['Sheet1']  # OK
ws_wrapper1 = wb_wrapper['Sheet1']  # OK
wb_wrapper_obj.rename_ws('Sheet1', 'Prices')
ws_wrapper1 = wb_wrapper.ws_dict['Prices']  # new sheet name
ws_wrapper1 = wb_wrapper['Prices']  # new sheet name
ws_wrapper1 = wb_wrapper['Sheet1']  # raises KeyError.
save_wb(fp=None) None

Save the .xlsx or .xlsm file.

Parameters

fp – The filepath at which to save the workbook. If not specified here, will save to the path currently configured in the .copy_fp attribute.

Returns

None

stage_ws(ws_name, header_row: int = 1, first_modifiable_row: int = -1, protected_rows: Optional[set] = None, rename_ws: Optional[str] = None)

Prepare a worksheet for modification.

Parameters
  • ws_name – The (original) sheet name.

  • header_row – The row containing headers (an int, indexed to 1)

  • first_modifiable_row – (Optional) The first row that may be modified (an int, indexed to 1). If not set, will default to the first row after the header_row.

  • protected_rows – A list-like object containing the rows that should never be deleted. Row numbers before first_modifiable_row and the header row will be automatically added to the rows that may not be deleted.

  • rename_ws

    (Optional) A string, for how to rename the worksheet. Defaults to None, in which case, it will not be renamed.

    Warning

    If the worksheet is renamed, the new name will be the key for this worksheet, and NOT the original worksheet name.

Returns

The WorksheetWrapper object for the newly staged sheet (which is also stored to .ws_dict, keyed by the sheet name).