Releases: devitocodes/devito
v4.7.1
v4.7.0
Changes
- compiler: Fix checkpoint size when save != None @speglich (#1906)
- compiler: Revamp generation of implicit equations @FabioLuporini (#1908)
API
- compiler: Support interpolation with user-provided implicit dims @FabioLuporini (#1948)
- api: Add op.cinterface for C-level interoperability @FabioLuporini (#1843)
Examples
- examples: Fix grammar and spelling mistakes in pml example @PershingSquare (#1930)
- examples: Update MPI tutorial notebook and scripts @georgebisbas (#1923)
- examples: Use solve instead of by hand derivation @mloubout (#1879)
- examples: Rename the viscoacoustic equations @nogueirapeterson (#1869)
- examples: Update attribute names and backend for synthetics notebook @EdCaunt (#1895)
- examples: Add colab/wsl dependencies to gempy notebook @EdCaunt (#1866)
- Fix 2D Gzz when sintheta==0 @mloubout (#1855)
Documentation
- misc: Documentation tweaks @FabioLuporini (#1853)
- misc: Add citation button @georgebisbas (#1848)
Compiler
- compiler: Generalize lowering of reductions @FabioLuporini (#1980)
- compiler: Restructure basic.Object sub-hierarchy @FabioLuporini (#1977)
- compiler: Infer type generation from _C_ctype @FabioLuporini (#1971)
- compiler: Reuse twin elemental functions @FabioLuporini (#1967)
- compiler: Support for MPI+CUDA @FabioLuporini (#1965)
- compiler: Remove IndexedData.free_symbols hack @FabioLuporini (#1958)
- compiler: Decouple pthreading pass from orchestration @FabioLuporini (#1938)
- compiler: Inception of two-stage derivative evaluation @FabioLuporini (#1893)
- compiler: Misc compiler improvements @FabioLuporini (#1912)
- compiler: Uncache data carriers @FabioLuporini (#1913)
- compiler: Misc minor code generation fixes @FabioLuporini (#1891)
- compiler: Minor FD refactorings @FabioLuporini (#1875)
- compiler: Minor blocking.py refactorings @georgebisbas (#1845)
- compiler: Avoid generating unnecessary conditionals @FabioLuporini (#1880)
- compiler: Move derivative.evaluate @FabioLuporini (#1872)
- compiler: Drop unnecessary check_indices @FabioLuporini (#1871)
- compiler: Add GuardOverflow @FabioLuporini (#1870)
- compiler: Flip par-tile unrolling @FabioLuporini (#1868)
- compiler: Relax Data.array_finalize to work with subclasses @FabioLuporini (#1865)
- compiler: Add SEPARABLE property @FabioLuporini (#1840)
MPI
- mpi: Fix data distribution bugs [part 1] @rhodrin (#1947)
- mpi: Always generate MPIComm with MPI enabled @FabioLuporini (#1905)
- mpi: Do not send padding region when doing pythonland halo exchange @tjb900 (#1874)
GPU
- install: Create scheduled base docker images for better build @mloubout (#1946)
- [gpu] Update to HPC SDK 22.2 and CUDA 11.6 @kenhester (#1842)
Architectures and JIT
- arch: Support OpenMP on Apple M1 @mloubout (#1931)
- compiler: Drop ARM specialization @FabioLuporini (#1882)
- arch: Fix CustomCompiler instantiation @AtilaSaraiva (#1850)
- arch: Add lineinfo compiler flag @FabioLuporini (#1849)
🐛 Bug Fixes
- compiler: Patch premature lowering of EvalDerivative @FabioLuporini (#1979)
- mpi: Fix data distribution bugs [part 1] @rhodrin (#1947)
- api: Fix Derivative substitution @mloubout (#1942)
- compiler: Patch fission (issue 1921) @FabioLuporini (#1922)
- compiler: Patch buffer initialization @FabioLuporini (#1916)
- compiler: Patch generate_implicit for corner cases @FabioLuporini (#1910)
- compiler: Patch race conditions due to storage-related dependencies @FabioLuporini (#1903)
- compiler: Patch iteration space scheduling @FabioLuporini (#1898)
- compiler: Fix guards fusion @FabioLuporini (#1883)
- compiler: Do not evaluate unevaluable expressions @mloubout (#1878)
- compiler: Patch norm accumulation @FabioLuporini (#1864)
- compiler: Patch norm enforcing double precision accumulation @FabioLuporini (#1861)
- compiler: Use long, not unsigned int, for linearization @FabioLuporini (#1841)
Benchmarking
- compiler: Add blockrelax tests and refresh advisor profiling @georgebisbas (#1929)
Testing
- tests: Refactor conftest skipif names @georgebisbas (#1960)
- install: Create scheduled base docker images for better build @mloubout (#1946)
- tests: Add MFE for issue 1753 @georgebisbas (#1873)
- CI: prevent conflict with pytest @mloubout (#1844)
Continuous Integration
- install: Update to intel-oneapi-mpi-devel @georgebisbas (#1961)
- install: Create scheduled base docker images for better build @mloubout (#1946)
- ci: Hotfix for actions-gh-pages @georgebisbas (#1937)
- ci: Hotfix actions-gh-pages @georgebisbas (#1936)
- ci: Update several GitHub actions versions @georgebisbas (#1934)
Installation
- install: Patch Dockerfile.nvidia @FabioLuporini (#1975)
- docker: Fix docker tags for publish @mloubout (#1982)
- install: Add more base images to Docker.nvidia @FabioLuporini (#1974)
- install: Reinstate libgl1-mesa-glx @georgebisbas (#1966)
- install: Add lib/intel64_lin @georgebisbas (#1963)
- install: Refactor Intel paths @georgebisbas (#1962)
- pip prod(deps): update ipyparallel requirement from <8.4 to <8.5 @dependabot (#1945)
- pip prod(deps): update distributed requirement from <2022.7 to <2022.8 @dependabot (#1955)
- install: Create scheduled base docker images for better build @mloubout (#1946)
- docker: Hotfix Dockerfile.nvidia @georgebisbas (#1940)
- gpu: Update to HPC 22.3, reduced image size @kenhester (#1918)
- pip prod(deps): update distributed requirement from <2022.6 to <2022.7 @dependabot (#1935)
- deps: Drop deprecated distutils imports @georgebisbas (#1899)
- pip prod(deps): update distributed requirement from <2022.5 to <2022.6 @dependabot (#1917)
- pip prod(deps): update distributed requirement from <2022.4 to <2022.5 @dependabot (#1894)
- CI-Docker: Publish CPU image with tag latest @navjotk (#1890)
- pip prod(deps): update distributed requirement from <2022.3 to <2022.4 @dependabot (#1884)
- sympy: new version compat @mloubout (#1858)
- install: Fix cflag for nvidia mpi4py @mloubout (#1854)
- install: Fix EOL typo @mloubout (#1852)
- Installation: update setup.py and cleanup docker files @mloubout (#1847)
- pip prod(deps): update distributed requirement from <=2022.2 to <2022.3 @dependabot (#1851)
v4.6.2
Changes
Documentation
- misc: update pypi @georgebisbas (#1837)
Compiler
- compiler: Augment code generation capabilities for CUDA/HIP/SYCL support @FabioLuporini (#1828)
GPU
🐛 Bug Fixes
- compiler: Patch linearization pass @FabioLuporini (#1839)
v4.6.1
Changes
API
Examples
- benchmarks: custom click type for grid params @mloubout (#1832)
- examples: Add nonzero example to ConditionalDimension tutorial @georgebisbas (#1820)
- examples: add adjoint, born, gradient, checkpointing to the tti example @mloubout (#1809)
- examples: Improve tti_pure_wave_eq tutorial @ofmla (#1779)
- examples: Add tti_pure_wave_eq tutorial @ofmla (#1752)
Compiler
- compiler: CUDA/HIP/SYCL preliminaries + misc improvements @FabioLuporini (#1819)
- compiler: Refactorings, simplifications, generalizations @FabioLuporini (#1810)
- compiler: gpu cc detection fix @georgebisbas (#1814)
- compiler: Evaluate MIN/MAX expressions with assumptions @georgebisbas (#1798)
- compiler: Move lambda level to IncrDimension property @georgebisbas (#1793)
- compiler: Further minor tweaks @FabioLuporini (#1796)
- compiler: Nested indexification @mloubout (#1789)
- compiler: Minor patches @FabioLuporini (#1784)
- compiler: Make Injection/Interpolation part of sympy hierarchy @FabioLuporini (#1782)
- compiler: Refactor subdomain hierarchy @FabioLuporini (#1781)
- compiler: Enable specialization of Function.data @FabioLuporini (#1778)
- compiler: Add evalmin, evalmax utilities @georgebisbas (#1777)
- compiler: Simplifications @FabioLuporini (#1773)
MPI
- mpi: Fix mask ordering for sparse gather @mloubout (#1824)
- mpi: Add MPI support for python3.9 @georgebisbas (#1790)
GPU
- gpu: move blocking pass from custom to advanced mode @italoaug (#1818)
- gpu: gpu cc detection fix @georgebisbas (#1814)
- gpu: add nvidia gpu compute capability auto-detection @georgebisbas (#1803)
- gpu: Loop tiling for GPU @italoaug (#1801)
- gpu, compiler: Update HPC SDK 21.9 @kenhester (#1765)
Architectures and JIT
- arch: Support OSX+M1 @FabioLuporini (#1822)
- compiler: add nvidia gpu compute capability auto-detection @georgebisbas (#1803)
🐛 Bug Fixes
- examples: Change stability testing sizes to avoid domain overlap with MPI @mloubout (#1772)
- compiler: Fixes #1695 by prioritising innermost vectorizable candidates @georgebisbas (#1697)
Continuous Integration
- ci: Update codecov action version @mloubout (#1825)
- ci: Fix OSX setup in tutorials @mloubout (#1807)
- ci: Use conda with python 3.8 (defaults to 3.10 otherwise) @FabioLuporini (#1775)
Installation
- misc: Add requirements.txt and requirements-optional.txt to MANIFEST.in @hmeiland (#1835)
- reqs: version check for distributed @georgebisbas (#1830)
- pip prod(deps): update distributed requirement from <2021.13 to <2022.2 @dependabot (#1821)
- pip prod(deps): update distributed requirement from <2021.12 to <2021.13 @dependabot (#1805)
- ci: Add python 3.10, gcc-10 option @georgebisbas (#1795)
- pip prod(deps): update distributed requirement from <2021.11 to <2021.12 @dependabot (#1794)
- reqs: Extend SymPy support to 1.9 @mloubout (#1786)
- pip prod(deps): update distributed requirement from <2021.10 to <2021.11 @dependabot (#1785)
- install: Align docker for cpu and nvidia @hmeiland (#1758)
v4.6
Changes
API
Examples
- examples: enforce stable space order for self adjoint op @mloubout (#1747)
- tests: add tti_setup to gradientJ test @ofmla (#1740)
Compiler
- compiler: Add machinery for custom memory allocators and MPI @FabioLuporini (#1764)
- compiler: lift skewing in higher block levels @georgebisbas (#1735)
- compiler: Loop fission @FabioLuporini (#1732)
- compiler: improve HB generated code @georgebisbas (#1731)
- compiler: Introduce linearization pass @FabioLuporini (#1727)
- compiler: Introducing min/max bounds to replace 'bf' elemental functions @georgebisbas (#1673)
MPI
- mpi: Speedup index_glb_to_loc @FabioLuporini (#1748)
- mpi: Mitigate SparseFunction setup costs @mloubout (#1720)
GPU
- compiler: Add optimization option to fuse WithLocks tasks @FabioLuporini (#1736)
🐛 Bug Fixes
- mpi: Patch neighborhood construction @FabioLuporini (#1768)
- compiler: Patch SubDomainSet with NVC @FabioLuporini (#1767)
- compiler: Patch and improve SubDomainSet @FabioLuporini (#1762)
- compiler: Fix zero to zero slices @rhodrin (#1757)
- bench: Patch jacobian operators + MPI (see issue #1744) @FabioLuporini (#1745)
Benchmarking
- bench: Patch jacobian operators + MPI (see issue #1744) @FabioLuporini (#1745)
- bench: Add warmup option to run mode @FabioLuporini (#1742)
Continuous Integration
Installation
- install: Update to HPCSDK 21.7, Update to Jupyter>=3.0 @FabioLuporini (#1760)
- pip prod(deps): update distributed requirement from <2021.9 to <2021.10 @dependabot (#1749)
- pip prod(deps): update distributed requirement from <2021.8 to <2021.9 @dependabot (#1737)
- reqs: enforce pip>=21.1.2 for conda env installation @georgebisbas (#1734)
- reqs: pip new file arg format @georgebisbas (#1733)
Misc
- misc: git ignore *.npy files @georgebisbas (#1729)
v4.5
Changes
API
- dsl: Generalised MatrixSparseTimeFunction @tjb900 (#1719)
- dsl: Improve support for running operators concurrently in python threads @tjb900 (#1708)
Examples
- examples: Add viscoacoustic Born operator to 1st sls equation @nogueirapeterson (#1690)
- examples: Switch to adjoint time derivative @mloubout (#1706)
- examples: re-run skew adjoint notebook @jkwashbourne (#1710)
- examples: Fix source illumination @rabreucristo (#1707)
- examples: Fix broken gempy import in tutorial notebook @EdCaunt (#1704)
- examples: Add notebook for creating seismic synthetics with gempy @EdCaunt (#1643)
- examples: Tweak LSRTM_acoustic notebook @rabreucristo (#1698)
- examples: Add LSRTM acoustic notebook @rabreucristo (#1574)
Compiler
- compiler: Revamp CIRE exploiting EvalDerivative @FabioLuporini (#1688)
- compiler: Improve derivative factorization pass @FabioLuporini (#1657)
MPI
- mpi: Towards compatibility for mode!=basic and GPU @FabioLuporini (#1721)
- mpi: Patch MPI cleanup @FabioLuporini (#1712)
GPU
- gpu: Enable tile clause in place of collapse with OpenACC @FabioLuporini (#1703)
- gpu: HPC SDK 21.5, Singularity, HPCX MPI update @kenhester (#1709)
- gpu: Set device on pthreads @FabioLuporini (#1716)
- compiler: Revamp data streaming @FabioLuporini (#1702)
- gpu: Fix offloading when zero-size arrays @FabioLuporini (#1684)
- gpu: Patch selection of streamed TimeFunctions @FabioLuporini (#1683)
🐛 Bug Fixes
- compiler: Revamp data streaming @FabioLuporini (#1702)
- API: fixed callback of PrecomputedInterpolator injection @ccuetom (#1691)
Benchmarking
- bench: Patch jacobian_adjoint @FabioLuporini (#1718)
- bench: prevent zero input for jacobian @mloubout (#1713)
- bench: Fix gflopss dumping @FabioLuporini (#1705)
- bench: Allow asv workflow to discover new benchmarks @rhodrin (#1689)
- bench: Fix asv new benchmark issue @rhodrin (#1686)
Testing
- tests: Patch expected output for ipython==7.23.0 @FabioLuporini (#1687)
- tests: Add optimizations to linalg examples @georgebisbas (#1529)
Continuous Integration
- ci: Stop including tutorial notebooks in codecov @EdCaunt (#1694)
- Allow asv workflow to discover new benchmarks @rhodrin (#1689)
- bench: Fix asv new benchmark issue @rhodrin (#1686)
- Update release-drafter.yml @FabioLuporini (#1685)
Misc
- misc: Fix clear-cache script @FabioLuporini (#1679)
Installation
- pip prod(deps): update distributed requirement from <2021.7 to <2021.8 @dependabot (#1722)
- pip prod(deps): update distributed requirement from <2021.6 to <2021.7 @dependabot (#1701)
- pip prod(deps): update distributed requirement from <2021.5 to <2021.6 @dependabot (#1693)
v4.4
Changes
🐛 Bug Fixes
- compiler: Patch TempFunction pickling @FabioLuporini (#1677)
- bench: Patch ASV's generation of new plots @FabioLuporini (#1672)
- gpu: Fix leaks due to excessive fetching/prefetching @FabioLuporini (#1658)
- types: Drop unsafe/unnecessary memoization in the arguments processing engine @FabioLuporini (#1647)
- compiler: Fix processing of grid spacing @FabioLuporini (#1628)
- misc: Avoid crashing on missing _memfree_args @FabioLuporini (#1612)
- compiler: Fix ScheduleTree construction in presence of guards and/or syncs @FabioLuporini (#1611)
- misc: Patch ThreadID pickling @FabioLuporini (#1606)
- compiler: Patch issue #1592 (HaloScheme with time subdimensions) @Leitevmd (#1597)
- dsl: Patch symbolic coefficients with staggered grids @EdCaunt (#1595)
- Fix find_library issue to MacOS Big Sur @speglich (#1584)
- Fix hierarchical blocking + parallelism @FabioLuporini (#1580)
- BoundSymbol constructor to be cached @mloubout (#1576)
Compiler
- compiler: Tweak nested-par candidate condition @georgebisbas (#1669)
- compiler: Singletonize special symbols (e.g. nthreads) @FabioLuporini (#1650)
- misc: Drop unused backend infrastructure @FabioLuporini (#1632)
- compiler: Improve aliases detection, processing, and optimization @FabioLuporini (#1631)
- mpi: Add diag2 mode @FabioLuporini (#1630)
- compiler: Add skewing pass towards Temporal Blocking @georgebisbas (#1620)
- misc: Use pickled soname rather than generating a new one @FabioLuporini (#1605)
- compiler: Add option to use Functions, in place of Arrays, for compiler-generated temporary @FabioLuporini (#1591)
- compiler: Improve the cost model used by CIRE @FabioLuporini (#1585)
- Refactor Array sharing @FabioLuporini (#1583)
- Refactor Operator hierarchy @FabioLuporini (#1573)
- Introduce devito/arch @FabioLuporini (#1563)
- operator: Use FD-Gpts/s instead of Gpts/s @georgebisbas (#1544)
API
- sympy: Support v1.8 @mloubout (#1549)
- compiler: Check consistency between shape and grid for TimeFunction, too @tjb900 (#1667)
- dsl: Add MatrixSparseTimeFunction to support multi-point sources @FabioLuporini (#1603)
- dsl: Enable overriding over SubDimension thickness @FabioLuporini (#1608)
- Shift argument as a tuple @Leitevmd (#1561)
- BoundSymbol constructor to be cached @mloubout (#1576)
- gpu: Add devicerm API for conditional deletions @FabioLuporini (#1571)
- Introduce deviceid API to offload Operators on specific GPUs @FabioLuporini (#1569)
Examples
- advisor: merge roofline and json @georgebisbas (#1649)
- examples: TTI 1st order operators @ofmla (#1602)
- examples: Add viscoacoustic Born operator to 2nd sls equation @nogueirapeterson (#1617)
- examples: Born approximation for TTI media @ofmla (#1555)
- examples: add first order adjoint viscoacoustic equations @nogueirapeterson (#1567)
- Tutorials: add shift parameter to Ren visco-acoustic equation @nogueirapeterson (#1562)
Documentation
- Bump static release number @FabioLuporini (#1564)
MPI
- compiler: Check consistency between shape and grid for TimeFunction, too @tjb900 (#1667)
- mpi: Add diag2 mode @FabioLuporini (#1630)
- mpi: Add ability to specify MPI topology used for grid division @FabioLuporini (#1604)
- mpi: Prevent double finalization with devito used as a lib @FabioLuporini (#1609)
- mpi: Pass MPI_COMM_SELF to Distributor upon unpickling @FabioLuporini (#1607)
GPU
- gpu: Fixup prefetch jitting when using extra symbols @FabioLuporini (#1678)
- gpu: Update Dockerfile.nvidia to HPC SDK 21.3 @kenhester (#1659)
- gpu: Fix leaks due to excessive fetching/prefetching @FabioLuporini (#1658)
- gpu: Add gpu-fit value for all functions (fix #1642) @Leitevmd (#1645)
- compiler: Work around clang[10,11,?] omp-offloading bug @FabioLuporini (#1634)
- arch: Review get_gpu_info @Leitevmd (#1626)
- misc: Updating to NVIDIA HPC SDK 21.2 @kenhester (#1594)
- compiler: Target gpu for PGI openacc @Leitevmd (#1587)
- arch: Add nvidia-smi parser to GPU checking @Leitevmd (#1615)
- gpu: Updated Dockerfile.nvidia to HPC SDK 21.1 @kenhester (#1593)
- gpu: Add devicerm API for conditional deletions @FabioLuporini (#1571)
- Data streaming support with OpenMP offloading @FabioLuporini (#1556)
Testing
- ci: Modify mpi-example workflow and update docker actions @rhodrin (#1664)
- ci: Switch to default gcc(9.3) for conda build @mloubout (#1651)
- ci: Fix conda build @mloubout (#1648)
- ci: Update Ci-gpu for nvidia openmp @Leitevmd (#1635)
- ci: Work around archives missing error on apt install @FabioLuporini (#1623)
- ci: Transfer gpu workflow to self-hosted runners @rhodrin (#1618)
- ci: Separate adjoint based tests @rhodrin (#1572)
v4.3
🐛 Bug Fixes
- Patch aliases' min-storage option @FabioLuporini (#1535)
- Fix loop collapsing @FabioLuporini (#1534)
- operator: Patch grid-carried argument overrides @FabioLuporini (#1523)
- Fix profiling IET for Intel Advisor benchmarking @georgebisbas (#1516)
- Fix ARM issues @georgebisbas (#1515)
- make test tools less random @ggorman (#1507)
- ir: Fixup _lower_stepping_dims @Leitevmd (#1504)
- Fix custom coefficients on staggered grids @EdCaunt (#1497)
- Tensor grid fixup @mloubout (#1489)
- Patch issue #1477 @FabioLuporini (#1488)
- Patch compilation time @FabioLuporini (#1487)
- Conditionals: Added diff2sympy to conditionals @EdCaunt (#1475)
- Minor tweaks @FabioLuporini (#1469)
- Fix SubDomainSet bug @rhodrin (#1457)
- Function global size method @rhodrin (#1499)
- Fix parallel increments @FabioLuporini (#1446)
- fix misspelling of openmp-targets for amdgcn-amd-amdhsa @paklui (#1429)
- examples: Fix opt arg-passing for elastic and TTI @georgebisbas (#1425)
API
- FD: add missing shift option @Leitevmd (#1551)
- Refactor and enhance section profiling @FabioLuporini (#1547)
- GPU data streaming @FabioLuporini (#1520)
- builtins: initialize FD halo @mloubout (#1511)
- Fix custom coefficients on staggered grids @EdCaunt (#1497)
- Tensor grid fixup @mloubout (#1489)
- Added origin_map attribute to Grid @EdCaunt (#1461)
- Tensor API: improve behavior and interleaving with sympy. @mloubout (#1390)
Examples
- Example: add second-order viscoacoustic equations @nogueirapeterson (#1406)
- Updated cavity flow notebook for cfd examples @kmn319 (#1502)
- Notebooks with Damping, PML and HABC absorbing boundary conditions. @felipeaugustogudes (#1323)
- Examples: Add tests and damp safety checks for backward compatibility @mloubout (#1434)
- Move time block notebook to seismic/tutorials @jkwashbourne (#1441)
- examples: Fix opt arg-passing for elastic and TTI @georgebisbas (#1425)
- Time Blocking Notebook POC w/serialization and compression @jkwashbourne (#1424)
- examples: Patch plot_velocity @FabioLuporini (#1428)
- Self-adjoint seismic: Update correctness linearization plot @jkwashbourne (#1423)
Documentation
- Correct typo in examples/cfd/03_diffusion.ipynb @jakubbober (#1451)
- Replaced mini-web-app for Slack invites with Slack's native solution. @ggorman (#1439)
MPI
- builtins: initialize FD halo @mloubout (#1511)
- Fix halo update hoisting @Leitevmd (#1494)
- Fixup _hoist_halospots @FabioLuporini (#1496)
- Fixup _drop_halospots @FabioLuporini (#1495)
- Outhalo size warning @rhodrin (#1493)
- Fixing halo spot merge rule (issue 1459) @Leitevmd (#1482)
- MPI + OpenMP 5.0 @italoaug (#1363)
- mpi data gatherer @rhodrin (#1376)
GPU
- Skip viscoacoustic tests with openacc @rhodrin (#1559)
- GPU data streaming @FabioLuporini (#1520)
- Use omp_set_default_device for OpenMP GPU offloading @italoaug (#1517)
- Tidy gpu CI workflow @rhodrin (#1470)
- Fix MPI in Dockerfile.nvidia @kenhester (#1492)
Testing
- Skip viscoacoustic tests with openacc @rhodrin (#1559)
- Tweak ASV and GPU workflows @rhodrin (#1558)
- docker publish tweaks @rhodrin (#1548)
- Docker workflow updates @rhodrin (#1546)
- CI: Update gpu workflow @rhodrin (#1542)
- Migrate asv workflow to new runner @rhodrin (#1530)
- Move mpi examples to new runner @rhodrin (#1543)
- Migrate MPI CI @rhodrin (#1531)
- Add test that compares gradient equivalence @navjotk (#1519)
- ci: Disable dask tutos on OSX build @FabioLuporini (#1505)
- CI: skip dask on OSX builds @mloubout (#1458)
Misc
- Update distributed requirement to <2021.2 @dependabot (#1557)
- Improve bitwise reproducibility @ggorman (#1438)
- Update to NVidia HPC 20.9; MPI Fixes @kenhester (#1526)
- perf: Add global gflopss to operator log @georgebisbas (#1462)
- Advisor miscellaneous improvements @georgebisbas (#1503)
- Updated dependencies in Dockerfile.nvidia: HPC SDK 20.9, cupy110 @kenhester (#1484)
- Improve detection and scheduling of CIREs @FabioLuporini (#1465)
- Intel Advisor jupyter notebook benchmarks/user/advisor scripts update @jack-lascala (#1456)
- Added sniffing Intel MPI as MPI version in compiler.py @jack-lascala (#1455)
- Moving and packaging the Intel Advisor 2020 scripts into benchmarks/user/advisor @jack-lascala (#1452)
- Moved from PGI to HPC SDK. New NVIDIA Based-Container. @kenhester (#1450)
- Intel Advisor 2020 automatic roofline tools @jack-lascala (#1440)
- Potentially smarter CIRE via cire-rotate option @FabioLuporini (#1430)
- Update pytest requirement from to >=3.6,<7.0 @dependabot (#1414)
v4.2.3
Synopsis
- Performance optimizations in the symbolic layer and generated code for x86, GPU and MPI.
- Various minor correctness and performance bug fixes.
- Improvements to application developer API.
- Added new tutorial notebooks.
- Increased test coverage, particularly for MPI and GPUs.
Backwards compatibility breaks and deprecations
None
Changes
- Restrict pytest to < 6.0 @rhodrin (#1411)
- pip prod(deps): update distributed requirement from <2.20 to <2.22 @dependabot (#1402)
- MPI minor bug fixes @rhodrin (#1408)
- mac notebook fix @rhodrin (#1407)
- Run docker test @mloubout (#1398)
- docker: fix typos @mloubout (#1397)
- compiler: version checks avoid unreliable exception @dbowman-ion (#1386)
- Fix cross-loop blocking with imperfect nests @FabioLuporini (#1381)
- Cancel previous CI at new commit @mloubout (#1378)
- Heapification of temporaries @FabioLuporini (#1349)
- Fix missing halo exchange over invariants fields @FabioLuporini (#1359)
- tutorials: 00_index fix broken link @georgebisbas (#1356)
- CI: add autolog of PRs @mloubout (#1338)
🐛 Bug Fixes
- Fix issue #1298 @FabioLuporini (#1421)
- ir: Fix lowering of ConditionalDimension @FabioLuporini (#1419)
- mpi: Patch compute_loc_indices + test @FabioLuporini (#1416)
- Fix estimate_cost @FabioLuporini (#1405)
- Fix issue 1332 (fablup version) @FabioLuporini (#1389)
- checkpointing: set right size_ckp @ofmla (#1361)
- Data tweaks @rhodrin (#1369)
- Fix min storage @FabioLuporini (#1372)
- Avoid dereferencing dangling pointers on GC'd objects @tjb900 (#1316)
- Replace subdimension in expressions @mloubout (#1345)
- Indexify with non-symbolic numeric indices @mloubout (#1343)
- Fix numeric indices @mloubout (#1341)
API
- Buffer: remove unused from args to allow multi buffers @mloubout (#1420)
- operator: Expose more opt knobs to the user level @FabioLuporini (#1387)
- Function/Expr shift utility @mloubout (#1377)
- Prevent sympy cache blowup @mloubout (#1366)
- Symbolic speedup @mloubout (#1362)
Examples
- examples: fix issue #1393 @ofmla (#1400)
- tutorials: add missing ignore tag in dask tutorial @mloubout (#1403)
- Dask tutorial with operator created only once @ofmla (#945)
- tests: Fix args and test viscoacoustic @georgebisbas (#1365)
- Add norm asserts for ssa iso notebooks @jkwashbourne (#1373)
- Make dt a solve property to use latest model parameters @mloubout (#1371)
- examples: Drop unnecessary NBVALs @FabioLuporini (#1367)
- Refine CFL condition in seismic @mloubout (#1348)
- Freesurface with subdomain (iso-acoustic) @mloubout (#1344)
- Improve solve speed for complicated equations @mloubout (#1342)
- elastic-fix @mloubout (#1339)
Documentation
- tutos: Inception of performance modes notebook @georgebisbas (#1358)
- Fix issue 1185 @dabiged (#1360)
- ConditionalDimensions notebook @georgebisbas (#1269)
- Add binder directory with requirements.txt including matplotlib @jkwashbourne (#1351)
MPI
- ci: Test all critical seismic examples with MPI @FabioLuporini (#1418)
- Single precision interpolation @mloubout (#1413)
- Update halo warning @rhodrin (#1370)
- Fix MPI benchmark json results @jaimesouza (#1295)
- Improve detection of redundant halo exchanges @FabioLuporini (#1352)
GPU
- Add MPI+GPU runs to CI @georgebisbas (#1251)
- Tweak CIRE for GPUs @FabioLuporini (#1409)
- Add docker publish @mloubout (#1388)
- Nvidia dockerfile @FabioLuporini (#1392)
- Tool get_gpu_info (using lshw, lspci) with test @jack-lascala (#1383)
Contributors
Many thanks to all the contributors to this release (last surname alphabetical order):
- George Bisbas (Imperial College London)
- David Bowman (ION)
- Tim Burgess (DUG)
- Jaime Freire de Souza
- Chris Dinneen
- Ken Hester (NVidia)
- Navjot Kukreja (Imperial College London)
- Giacomo La Scala
- Mathias Louboutin (Georgia Institute of Technology)
- Fabio Luporini (Devito Codes)
- Oscar Mojica (SENAI CIMATEC)
- Rhodri Nelson (Imperial College London)
- John Washbourne (Chevron)
Devito-v4.2.2
Compiler:
- Improve CIRE
- Conditionals improvement
- Improve aliases detection
Misc:
- Updated minimum SymPy version requirement
- Refreshed docs
- Various bug fixes.
- Added more tests for CI and performance regression.
Benchmark:
- Improved asv
- Added support for adjoint/jacobian/jacobian_adjoint
- Improved JIT support for AMD, added AOMP compiler
Example:
- New Skew Self Adjoint operator
- Homogenize seismic examples and Model.
Many thanks to all the contributors to this release (last surname alphabetical order):
George Bisbas (Imperial College London)
Gerard Gorman (Imperial College London)
Mathias Louboutin (Georgia Institute of Technology)
Fabio Luporini (Devito Codes)
Oscar Mojica (SENAI CIMATEC)
Rhodri Nelson (Imperial College London)
Peterson Nogueira (SENAI CIMATEC)
João Henrique Speglich (SENAI CIMATEC)
Lauê Rami Souza Costa de Jesus (SENAI CIMATEC)
John Washbourne (Chevron)