Hephflow

This repository contains Hephflow, a moment-based implementation of the Lattice Boltzmann Method (LBM) for GPU acceleration using CUDA.
Hephflow is the successor to the previous project MR-LBM, and builds upon its concepts and codebase, introducing new features and improvements.

Project Overview

Hephflow implements the moment-based Lattice Boltzmann method, where the collision is carried out across moments from the 0th to the 2nd order.
This project is intended primarily as a proof of concept, with many features still under development.

Hephflow is heavily based on the previous version VISCOPLASTIC-LBM, and therefore shares similar licensing terms and design philosophy.

Installation Guide

Requirements

To compile and run the project, you will need:

C++ compiler (e.g., MSVC)
Nvidia GPU drivers
CUDA toolkit

The CUDA toolkit, provided by Nvidia, includes both the GPU drivers and the necessary CUDA libraries. Ensure that your GPU supports compute capability 3.5 or higher. Note that the software is designed to run on a single GPU, and multi-GPU setups are not currently supported.

Compilation

To compile the project, you can use the provided bash script. The script includes instructions on how to modify it based on your GPU’s compute capability and the necessary arguments.

Installation Steps

Install the CUDA toolkit from Nvidia's website.
Modify the compile.sh file if necessary for your GPU when on Windows. compile_linux.sh for linux machines
Run the script to compile the project.

Simulation

Output Files

The program generates binary output files containing macroscopic quantities such as density and velocity, as well as an information file that describes the simulation parameters (lattice size, tau, velocity set, etc.).

To convert these binary files into a format that can be interpreted by other software (e.g., Paraview), a Python post-processing tool is provided.

Post Processing

To process the output binary files, you will need Python 3 and the following dependencies:

glob
numpy
os
pyevtk
matplotlib To convert the binary files to .vtr files (used in Paraview), use the following command:

python exampleVtk.py "PATH_FILES/ID_SIM"

File Structure

No	File Name	Details
1	main	Main application code
2	mlbm	Core kernel with streaming-collision operations
3	var	Simulation parameters
4	compile.sh	Compilation script (edit for correct CUDA version)
5	definitions.h	Constants used within LBM
6	arrayIndex.h	Index calculation for moments
7	auxFunctions.h	Auxiliary functions
8	cases	Folder for different simulation cases
9	cases/bc_definition	Boundary condition definitions for lattices
10	cases/bc_initialization	Boundary condition flags
11	cases/constants	Constants specific to the case
12	cases/flow_initialization	Flow initialization parameters
13	cases/model	Model parameters (e.g., velocity set, collision model)
14	cases/output	Data export parameters
15	colrec	Collision and reconstruction files for moments
16	includeFiles/popSave	Load population from global memory
17	includeFiles/popLoad	Save population to global memory
18	includeFiles/interface	Definition of boundary conditions (wall or periodic)
19	checkpoint	Functions for generating simulation checkpoints
20	errorDef	Error handling functions
21	globalFunctions	Index functions
22	globalStructs	Structures for device and host
23	lbmInitialization	Field initialization functions
24	nnf	Non-Newtonian fluid definitions
25	nodeTypeMap	Boundary condition node type map
26	particleTracer	Functions for particle tracing
27	reduction	Parallel reduction functions
28	saveData	Functions to save simulation data

Creating a Boundary Case

To define a new boundary case, you need to create files within the cases folder. Each case should contain the following files:

bc_definition: Defines the mathematical equations for the moments (0th to 2nd order).
bc_initialization: Defines the boundary condition flags.
constants: Defines simulation parameters (mesh size, velocity, etc.).
flow_initialization: Defines how the flow is initialized.
model: Defines model parameters (e.g., velocity set, collision model).
output: Defines output parameters for data export.

Additional models like thermal or viscoelastic will require additional files.

Using Voxels Immersed Bodies

To use voxels for immersed bodies, follow these steps:

Create a CSV file containing the coordinates of the solid nodes.
Add an include directive for VOXEL_FILENAME in the constants file.
Include VOXEL_BC_DEFINE in the bc_definition.

Benchmark

The following table summarizes performance benchmarks on various GPUs:

GPu	sm	GPU Clock (GHz)	Memory Clock (GHz)	Block Size	MLUPs	Observations
RTX 4090 OC	89	3.0 GHz	1.5 GHz	8x8x8	9075
RTX 4090	89	2.8 GHz	1.3 GHz	8x8x8	7899
RTX 4060	89	2.8 GHz	2.1 GHz	8x8x8	2167
RTX 4060	89	2.8 GHz	2.1 GHz	8x8x4	1932
---------------	----	-----------------	--------------------	-----------------	-----------	---------------------------------------
RTX 3060 OC	86	2.0 GHz	2.0 GHz	8x8x8	3083
RTX 3060	86	1.8 GHz	1.8 GHz	8x8x8	2755
---------------	----	-----------------	--------------------	-----------------	-----------	---------------------------------------
A100	80	1.3 GHz	1.5 GHz	16x8x8 (D)	10243
A100	80	1.3 GHz	1.5 GHz	8x8x8	11390
---------------	----	-----------------	--------------------	-----------------	-----------	---------------------------------------
RTX 2060	75	1.9 GHz	1.7 GHz	8x8x8	2397
GTX 1660	75	1.9 GHz	2.0 GHz	8x8x8	1323
GTX 1660	75	1.9 GHz	2.0 GHz	8x8x4	1248
GTX 1660	75	1.9 GHz	2.0 GHz	16x4x4	1213
T600	75	1.5 GHz	1.25 GHz	8x8x8	881	Limited by GPU, Memory controler 84%
---------------	----	-----------------	--------------------	-----------------	-----------	---------------------------------------
K20x	35	0.7 GHz	1.3 GHz	8x8x4	730	Limited by GPU, Memory controler 47%
K20x	35	0.7 GHz	1.3 GHz	8x8x8	670	Limited by GPU, Memory controler 40%
K80	35	0.8 GHz	1.2 GHz	8x8x8	1142	Using 1 Chip
---------------	----	-----------------	--------------------	-----------------	-----------	---------------------------------------

Currently tested speed using benchmark case FP32, differences may occur due to frequencies and thermal throttle.

Note:

D refers to dynamic allocation of shared memory, which is required to increase the maximum shared memory per block to 48 KB.

Gallery

In progress

Publications

Previous Version: MR-LBM

Hephflow is a direct evolution of the MR-LBM project.
If you are looking for the original MR-LBM implementation, see MR-LBM on GitHub.

TODO

The TODO.todo file contains list of other objectives that may or may not be implemented. Either contain items to improve performance or enhance the capabilities of the model.

License

This software is provided under the GPLv2 license.

Contact

For bug reports or issues, please use the GitHub issue tracker for better visibility. You can also contact the maintainers via email at:

e.marcoferrari@utfpr.edu.br

Name		Name	Last commit message	Last commit date
Latest commit History 473 Commits
.github		.github
Images		Images
MatlabSupportFiles		MatlabSupportFiles
aSupportFiles		aSupportFiles
bin		bin
post		post
src		src
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE.txt		LICENSE.txt
README.md		README.md
SECURITY.md		SECURITY.md
TODO.todo		TODO.todo

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Hephflow

Table of Contents

Project Overview

Installation Guide

Requirements

Compilation

Installation Steps

Simulation

Output Files

Post Processing

File Structure

Creating a Boundary Case

Using Voxels Immersed Bodies

Benchmark

Note:

Gallery

Publications

Previous Version: MR-LBM

TODO

License

Contact

About

Uh oh!

Releases 1

Packages

Languages

License

CERNN/Hephflow

Folders and files

Latest commit

History

Repository files navigation

Hephflow

Table of Contents

Project Overview

Installation Guide

Requirements

Compilation

Installation Steps

Simulation

Output Files

Post Processing

File Structure

Creating a Boundary Case

Using Voxels Immersed Bodies

Benchmark

Note:

Gallery

Publications

Previous Version: MR-LBM

TODO

License

Contact

About

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages