Write-through support for memory-mapping #81

Merged 7 commits on Aug 5, 2024
2 changes: 1 addition & 1 deletion .github/workflows/python-package.yml
@@ -23,7 +23,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
python-version: ["3.9", "3.10", "3.11", "3.12"]
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -6,7 +6,7 @@ sphinx:
build:
os: ubuntu-20.04
tools:
python: "3.8"
python: "3.12"
jobs:
post_install:
# See https://github.com/pdm-project/pdm/discussions/1365
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,17 @@
All notable changes to this project will be documented here.

## [Unreleased]
### Added
- Official support for NumPy 2.0.
- Support for write-through memory-mapping. Thanks to @nh2 and
@chpatrick for the original implementation.

### Fixed
- A small unit test bug.

### Removed
- Official support for Python 3.8.
- Official support for NumPy < 1.21.

## [1.0.3] - 2024-01-06
### Fixed
33 changes: 26 additions & 7 deletions doc/usage.md
@@ -70,8 +70,8 @@ Concretely:
```Python Console
>>> plydata.elements[0].name
'vertex'
>>> plydata.elements[0].data[0]
(0., 0., 0.)
>>> plydata.elements[0].data[0].tolist()
(0.0, 0.0, 0.0)
>>> plydata.elements[0].data['x']
array([0., 0., 1., 1.], dtype=float32)
>>> plydata['face'].data['vertex_indices'][0]
@@ -91,8 +91,8 @@ and elements can be indexed directly without explicitly going through
the `data` attribute:

```Python Console
>>> plydata['vertex'][0]
(0., 0., 0.)
>>> plydata['vertex'][0].tolist()
(0.0, 0.0, 0.0)
>>>
```
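The switch to `.tolist()` in these doctests is presumably related to the NumPy 2.0 support added in this PR: NumPy 2 changed how scalars are printed, while `.tolist()` turns a record into a plain Python tuple whose repr does not depend on the NumPy version. A minimal sketch of that behavior, using a hypothetical stand-in array rather than a real PLY file:

```python
import numpy as np

# Hypothetical stand-in for plydata['vertex'].data: a structured array
# with three float32 fields.
vertex = np.array(
    [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)],
    dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')],
)

# A single record is a NumPy structured scalar; its repr can differ
# between NumPy versions, which makes it awkward in doctests.
record = vertex[0]

# tolist() converts the record to a tuple of built-in Python floats.
as_tuple = record.tolist()
print(as_tuple)
```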

@@ -138,21 +138,40 @@ two cases to consider.

If an element in a binary PLY file has no list properties, then it will
be memory-mapped by default, subject to the capabilities of the
underlying file object. Memory mapping can be disabled using the
`mmap` argument:
underlying file object.
- Memory mapping can be disabled or fine-tuned using the `mmap` argument
of `PlyData.read`.
- To confirm whether a given element has been memory-mapped or not,
check the type of `element.data`.

This is all illustrated below:

```Python Console
>>> plydata.text = False
>>> plydata.byte_order = '<'
>>> plydata.write('tet_binary.ply')
>>>
>>> # `mmap=True` is the default:
>>> # Memory-mapping is enabled by default.
>>> plydata = PlyData.read('tet_binary.ply')
>>> isinstance(plydata['vertex'].data, numpy.memmap)
True
>>> # Any falsy value disables memory-mapping here.
>>> plydata = PlyData.read('tet_binary.ply', mmap=False)
>>> isinstance(plydata['vertex'].data, numpy.memmap)
False
>>> # Strings can also be given to fine-tune memory-mapping.
>>> # For example, with 'r+', changes can be written back to the file.
>>> # In this case, the file must be explicitly opened with read-write
>>> # access.
>>> with open('tet_binary.ply', 'r+b') as f:
... plydata = PlyData.read(f, mmap='r+')
>>> isinstance(plydata['vertex'].data, numpy.memmap)
True
>>> plydata['vertex']['x'] = 100
>>> plydata['vertex'].data.flush()
>>> plydata = PlyData.read('tet_binary.ply')
>>> all(plydata['vertex']['x'] == 100)
True
>>>
```
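The difference between the default copy-on-write mode (`'c'`) and write-through mode (`'r+'`) can also be seen on a bare `numpy.memmap`, independent of `plyfile`. The sketch below uses a hypothetical scratch file of four floats; the file name and values are illustrative only:

```python
import os
import tempfile

import numpy as np

# Write four float32 values (0, 1, 2, 3) to a scratch file.
path = os.path.join(tempfile.mkdtemp(), 'values.bin')
np.arange(4, dtype='f4').tofile(path)

# 'c' (copy-on-write): changes stay in memory and never reach the file.
cow = np.memmap(path, dtype='f4', mode='c')
cow[:] = 100
after_cow = np.fromfile(path, dtype='f4')

# 'r+' (write-through): changes are written back to the file.
rw = np.memmap(path, dtype='f4', mode='r+')
rw[:] = 100
rw.flush()
after_rw = np.fromfile(path, dtype='f4')

print(after_cow)
print(after_rw)
```

With `'c'` the file still holds the original values after the assignment, whereas with `'r+'` the file holds the new values once `flush()` has been called.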

1,013 changes: 503 additions & 510 deletions pdm.lock

Large diffs are not rendered by default.

22 changes: 12 additions & 10 deletions plyfile.py
@@ -128,18 +128,17 @@ def _parse_header(stream):
)

@staticmethod
def read(stream, mmap=True, known_list_len={}):
def read(stream, mmap='c', known_list_len={}):
"""
Read PLY data from a readable file-like object or filename.

Parameters
----------
stream : str or readable open file
mmap : bool, optional (default=True)
Whether to allow element data to be memory-mapped when
possible. The default is `True`, which allows memory
mapping. Using `False` will prevent memory mapping.

mmap : {'c', 'r', 'r+'} or bool, optional (default='c')
Configures memory-mapping. Any falsy value disables
memory mapping, and any non-string truthy value is
equivalent to 'c', for copy-on-write mapping.
known_list_len : dict, optional
Mapping from element names to mappings from list property
names to their fixed lengths. This optional argument is
@@ -507,7 +506,7 @@ def _read(self, stream, text, byte_order, mmap,
stream : readable open file
text : bool
byte_order : {'<', '>', '='}
mmap : bool
mmap : {'c', 'r', 'r+'} or bool
known_list_len : dict
"""
if text:
@@ -519,7 +518,9 @@ def _read(self, stream, text, byte_order, mmap,
if mmap and _can_mmap(stream) and can_mmap_lists:
# Loading the data is straightforward. We will memory
# map the file in the requested mode (copy-on-write by
# default).
self._read_mmap(stream, byte_order, known_list_len)
mmap_mode = mmap if isinstance(mmap, str) else 'c'
self._read_mmap(stream, byte_order, mmap_mode,
known_list_len)
else:
# A simple load is impossible.
self._read_bin(stream, byte_order)
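The way the `mmap` argument is interpreted in this hunk can be summarized as a small function. This is only a sketch: `resolve_mmap_mode` is a hypothetical helper, not part of `plyfile`, and it returns `None` where the real code falls back to a non-mapped read.

```python
def resolve_mmap_mode(mmap):
    """
    Hypothetical helper mirroring the logic in the diff.

    Return None when memory-mapping is disabled, else the mode string.
    """
    if not mmap:
        # False, None, 0 and '' all disable memory-mapping.
        return None
    # Strings are passed through unchanged; any other truthy value
    # means copy-on-write.
    return mmap if isinstance(mmap, str) else 'c'


for value in (True, False, None, 'c', 'r', 'r+', ''):
    print(repr(value), '->', resolve_mmap_mode(value))
```

Note that the string is not validated at this point. In the diff it is handed to `numpy.memmap` as is, and the empty string counts as falsy and therefore disables mapping.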
@@ -549,14 +550,15 @@ def _write(self, stream, text, byte_order):
stream.write(self.data.astype(self.dtype(byte_order),
copy=False).data)

def _read_mmap(self, stream, byte_order, known_list_len):
def _read_mmap(self, stream, byte_order, mmap_mode, known_list_len):
"""
Memory-map an input file as `self.data`.

Parameters
----------
stream : readable open file
byte_order : {'<', '>', '='}
mmap_mode : str
known_list_len : dict
"""
list_len_props = {}
@@ -581,7 +583,7 @@ def _read_mmap(self, stream, byte_order, known_list_len):
if max_bytes < num_bytes:
raise PlyElementParseError("early end-of-file", self,
max_bytes // dtype.itemsize)
self._data = _np.memmap(stream, dtype, 'c', offset, self.count)
self._data = _np.memmap(stream, dtype, mmap_mode, offset, self.count)
# Fix stream position
stream.seek(offset + self.count * dtype.itemsize)
# remove any extra properties added
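The pattern in this hunk, mapping a run of fixed-size records at a byte offset and then repositioning the stream just past them, can be tried in isolation. The sketch below assumes a hypothetical file layout (a short header, three records, and trailing bytes) rather than a real PLY file:

```python
import os
import tempfile

import numpy as np

dtype = np.dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
records = np.array([(0, 0, 0), (0, 1, 1), (1, 0, 1)], dtype=dtype)
header = b'hypothetical header\n'  # stands in for the PLY header
trailer = b'TRAILER'               # stands in for the next element's data

path = os.path.join(tempfile.mkdtemp(), 'records.bin')
with open(path, 'wb') as f:
    f.write(header + records.tobytes() + trailer)

with open(path, 'rb') as stream:
    stream.seek(len(header))
    offset = stream.tell()
    count = len(records)
    # Map `count` records starting at `offset`, copy-on-write.
    data = np.memmap(stream, dtype, 'c', offset, count)
    # Do not rely on where np.memmap left the stream; seek explicitly
    # to the first byte after the mapped records.
    stream.seek(offset + count * dtype.itemsize)
    rest = stream.read()

print(data['x'], rest)
```

The explicit `seek` matters because the stream is shared with whatever reads the following element, so its position must end up just after the mapped data regardless of what `numpy.memmap` did with it.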
5 changes: 2 additions & 3 deletions pyproject.toml
@@ -23,8 +23,8 @@ description = "PLY file reader/writer"
authors = [
{name = "Darsh Ranjan", email = "dranjan@berkeley.edu"},
]
dependencies = ["numpy>=1.17"]
requires-python = ">=3.8"
dependencies = ["numpy>=1.21"]
requires-python = ">=3.9"
readme = "README.md"
license = {file = "COPYING"}
keywords = ["ply", "numpy"]
@@ -33,7 +33,6 @@ classifiers = [
"License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
17 changes: 17 additions & 0 deletions test/test_plyfile.py
@@ -376,15 +376,32 @@ def test_memmap(tmpdir, tet_ply_txt):

def test_copy_on_write(tmpdir, tet_ply_txt):
ply0 = tet_ply_txt
ply0.text = False
filename = str(tmpdir.join('test.ply'))
ply0.write(filename)
ply1 = PlyData.read(filename)
assert isinstance(ply1['vertex'].data, numpy.memmap)
ply1['vertex']['x'] += 1
ply2 = PlyData.read(filename)

verify(ply0, ply2)


def test_memmap_rw(tmpdir, tet_ply_txt):
ply0 = tet_ply_txt
ply0.text = False
filename = str(tmpdir.join('test.ply'))
ply0.write(filename)
with open(filename, 'r+b') as f:
ply1 = PlyData.read(f, mmap='r+')
assert isinstance(ply1['vertex'].data, numpy.memmap)
ply1['vertex']['x'][:] = 100
ply1['vertex'].data.flush()
ply2 = PlyData.read(filename)

assert (ply2['vertex']['x'] == 100).all()
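The tests in this diff exercise the default copy-on-write mode and `'r+'`. The third documented mode, `'r'`, is not covered here. A minimal sketch of the behavior one might assert for it, shown on a bare `numpy.memmap` with a hypothetical scratch file rather than through `PlyData.read`:

```python
import os
import tempfile

import numpy as np

# Scratch file holding four float32 zeros.
path = os.path.join(tempfile.mkdtemp(), 'values.bin')
np.zeros(4, dtype='f4').tofile(path)

# 'r' maps the file read-only.
ro = np.memmap(path, dtype='f4', mode='r')
try:
    ro[:] = 100
    write_failed = False
except ValueError:
    # NumPy refuses to assign into a read-only array.
    write_failed = True

print(ro.flags.writeable, write_failed)
```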


def test_write_invalid_filename(tet_ply_txt):
with Raises(TypeError) as e:
tet_ply_txt.write(None)
6 changes: 2 additions & 4 deletions tox.ini
@@ -1,9 +1,8 @@
[tox]
envlist = global-init,py38-numpy1.{17,19,21,22,23,24},py39-numpy1.{17,19,21,22,23,24,25},py310-numpy1.{21,22,23,24,25},py311-numpy1.{23,24,25},py312-numpy1.26,global-finalize
envlist = global-init,py39-numpy{1.21,1.22,1.23,1.24,1.25,1.26,2.0},py310-numpy{1.21,1.22,1.23,1.24,1.25,1.26,2.0},py311-numpy{1.23,1.24,1.25,1.26,2.0},py312-numpy{1.26,2.0},global-finalize

[gh-actions]
python =
3.8: py38
3.9: py39
3.10: py310
3.11: py311
@@ -21,14 +20,13 @@ usedevelop = True
deps =
pytest
pytest-cov
numpy1.17: numpy>=1.17,<1.18
numpy1.19: numpy>=1.19,<1.20
numpy1.21: numpy>=1.21,<1.22
numpy1.22: numpy>=1.22,<1.23
numpy1.23: numpy>=1.23,<1.24
numpy1.24: numpy>=1.24,<1.25
numpy1.25: numpy>=1.25,<1.26
numpy1.26: numpy>=1.26,<1.27
numpy2.0: numpy>=2.0,<2.1
setenv =
COVERAGE_FILE = {toxworkdir}/.coverage.{envname}
commands = py.test test -v --cov=plyfile