Diacritics are used in many languages to change the sound values of the letters to which they are added. In software development, diacritics often have to be replaced with their non-diacritic counterparts, e.g. to make user input easier to search and compare. Diacritics.NET is a basic mapper between diacritic characters and non-diacritic characters.
This library is available on NuGet: https://www.nuget.org/packages/Diacritics/

Use the following command to install Diacritics using the NuGet package manager console:

```
PM> Install-Package Diacritics
```
You can use this library in any .NET project which is compatible with PCL (e.g. Xamarin Android, iOS, Windows Phone, Windows Store, Universal Apps, etc.).
The most common use case of this library is to find and replace diacritic characters in a given string.
`RemoveDiacritics` is a string extension method which returns a diacritics-free string.
```csharp
// Arrange
const string InputString = "Je veux aller à Saint-Étienne";

// Act
string removeDiacritics = InputString.RemoveDiacritics();

// Assert
removeDiacritics.Should().Be("Je veux aller a Saint-Etienne");
```
If you just want to check whether a string contains diacritics, use the string extension method `HasDiacritics`.
```csharp
// Arrange
const string InputString = "Je veux aller à Saint-Étienne";

// Act
bool hasDiacritics = InputString.HasDiacritics();

// Assert
hasDiacritics.Should().BeTrue();
```
The examples shown above use extension methods which rely on a default implementation of `IDiacriticsMapper`, namely the type `DefaultDiacriticsMapper`. If you're using an IoC container, you can register `IDiacriticsMapper` either with the provided `DefaultDiacriticsMapper` or with your own implementation of `IDiacriticsMapper`.
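As a rough sketch of such a registration, the snippet below uses Microsoft.Extensions.DependencyInjection as the container. Several details are assumptions rather than facts taken from this README: that the types live in a `Diacritics` namespace, that `DefaultDiacriticsMapper` has a public parameterless constructor, and that `IDiacriticsMapper` itself exposes a `RemoveDiacritics(string)` method. Adjust it to the container and package version you actually use.

```csharp
using Diacritics; // assumed namespace of IDiacriticsMapper and DefaultDiacriticsMapper
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// Register the provided default mapper behind the interface.
// Assumes DefaultDiacriticsMapper can be constructed without arguments.
services.AddSingleton<IDiacriticsMapper, DefaultDiacriticsMapper>();

using var provider = services.BuildServiceProvider();

// Resolve the mapper wherever it is needed, typically via constructor injection.
var mapper = provider.GetRequiredService<IDiacriticsMapper>();

// Assumed instance method mirroring the string extension method.
string cleaned = mapper.RemoveDiacritics("Je veux aller à Saint-Étienne");
```

To plug in your own implementation, replace `DefaultDiacriticsMapper` in the registration with your type.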
Diacritics is extensible. You can add the accent mapping for your own language by implementing `IAccentMapping` (or by deriving from the `AccentMapping` base class). `DiacriticsMapper` accepts any `IAccentMapping` type at construction time.
Contributions to this library are very welcome. Just create a fork, commit your changes and open a pull request.
TODO: Add/Remove methods for adding/removing accents at runtime.
You can pass `DiacriticsOptions` into methods like `RemoveDiacritics` or `HasDiacritics`. The following properties can be set:

Property | Description |
---|---|
`Decompose` | `IAccentMapping` provides a property `IDictionary<char, MappingReplacement> Mapping` which defines the actual mappings between a character and a `MappingReplacement`. A `MappingReplacement` can be set up with the properties `Base`, `Decompose` and `DecomposeTitle`. In most languages, there is no need to set the `Decompose` and `DecomposeTitle` properties. Decomposition is needed, for example, in German to decompose the eszett character `ß` to `ss`. If the option `Decompose = true` is set, the diacritics mapper uses the `Decompose` and `DecomposeTitle` values instead of the `Base` value for diacritics replacements. |
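
To make the difference between `Base` and `Decompose` replacements concrete, here is a small self-contained sketch. It does not use the library: `ReplacementSketch` and `DecomposeSketch` are hypothetical stand-ins, the string-typed properties and the sample German mappings are illustrative assumptions, and the rule for when `DecomposeTitle` applies (an upper-case letter followed by a lower-case one) is a guess at the intent, not the library's documented behavior.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for MappingReplacement; property types are assumed.
public sealed class ReplacementSketch
{
    public string Base { get; set; }
    public string Decompose { get; set; }
    public string DecomposeTitle { get; set; }
}

public static class DecomposeSketch
{
    // Illustrative sample mappings, not copied from the library.
    private static readonly Dictionary<char, ReplacementSketch> Mapping =
        new Dictionary<char, ReplacementSketch>
        {
            ['Ä'] = new ReplacementSketch { Base = "A", Decompose = "AE", DecomposeTitle = "Ae" },
            ['ü'] = new ReplacementSketch { Base = "u", Decompose = "ue" },
            ['ß'] = new ReplacementSketch { Base = "s", Decompose = "ss" },
        };

    public static string Replace(string input, bool decompose)
    {
        var result = new StringBuilder(input.Length);
        for (var i = 0; i < input.Length; i++)
        {
            // Characters without a mapping are copied unchanged.
            if (!Mapping.TryGetValue(input[i], out var replacement))
            {
                result.Append(input[i]);
                continue;
            }

            // Without the option (or without a Decompose value), fall back to Base.
            if (!decompose || replacement.Decompose == null)
            {
                result.Append(replacement.Base);
                continue;
            }

            // Assumed rule: use the title-case form when a lower-case letter follows.
            var nextIsLower = i + 1 < input.Length && char.IsLower(input[i + 1]);
            result.Append(nextIsLower && replacement.DecomposeTitle != null
                ? replacement.DecomposeTitle
                : replacement.Decompose);
        }

        return result.ToString();
    }
}
```

With these sample mappings, `DecomposeSketch.Replace("Ärger über Straße", false)` yields `Arger uber Strase`, while passing `true` yields `Aerger ueber Strasse`.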
```
BenchmarkDotNet v0.15.2, macOS Sequoia 15.5 (24F74) [Darwin 24.5.0]
Apple M3 Max, 1 CPU, 14 logical and 14 physical cores
.NET SDK 9.0.301
  [Host]   : .NET 9.0.6 (9.0.625.26613), Arm64 RyuJIT AdvSIMD
  ShortRun : .NET 9.0.6 (9.0.625.26613), Arm64 RyuJIT AdvSIMD

Job=ShortRun  IterationCount=3  LaunchCount=1
WarmupCount=3
```
Method | Mean | Error | StdDev | Rank | Gen0 | Gen1 | Gen2 | Allocated |
---|---|---|---|---|---|---|---|---|
RemoveDiacritics_100kWords | 404.2 us | 114.19 us | 6.26 us | 1 | 83.0078 | 83.0078 | 83.0078 | 265.13 KB |
RemoveDiacritics_1mWords | 403.7 us | 35.59 us | 1.95 us | 1 | 83.0078 | 83.0078 | 83.0078 | 264.46 KB |
Legend
- Mean : Arithmetic mean of all measurements.
- Error : Half of 99.9% confidence interval.
- StdDev : Standard deviation of all measurements.
- Rank : Relative position of current benchmark mean among all benchmarks (Arabic style).
- 1 us : 1 Microsecond (0.000001 sec).
This project is offered under a dual license:
- Free for non-commercial use, including private and educational purposes.
- Commercial use requires a license. Please support the project by making a one-time donation of $50+ via https://buymeacoffee.com/thomasgalliker.
Recurring or more generous sponsorships are sincerely appreciated and help sustain ongoing development. Thank you for your support! If you have any questions about licensing, feel free to contact Thomas Galliker.