Keep special characters in URLs
When creating a page, the default URL segment validation automatically replaces special characters with their standard equivalents (e.g., "ä" is replaced with "a"). However, some clients may require these special characters to remain intact in URLs for non-English versions of their website.
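For CMS 12
// Startup.cs (requires: using EPiServer.Web;)
// Place inside ConfigureServices(IServiceCollection services).
var validChars = "ü ö ä ß ó ñ á á é í ó ő ú ü ñ";
services.Configure<UrlSegmentOptions>(config =>
{
    // Allow non-ASCII (IRI) characters in URL segments.
    config.SupportIriCharacters = true;
    // Default ASCII set plus the extra characters listed above.
    config.ValidCharacters = @"A-Za-z0-9\-_~\.\$" + validChars;
});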
For CMS 11
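using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using EPiServer.ServiceLocation;
using EPiServer.Web;

[InitializableModule]
[ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
public class UrlSegmentConfigurationModule : IConfigurableModule
{
    public void ConfigureContainer(ServiceConfigurationContext context)
    {
        var validChars = "ü ö ä ß ó ñ á á é í ó ő ú ü ñ";

        // Replace the default options registration with a customised one.
        context.Services.RemoveAll<UrlSegmentOptions>();
        context.Services.AddSingleton<UrlSegmentOptions>(s => new UrlSegmentOptions
        {
            // Allow non-ASCII (IRI) characters in URL segments.
            SupportIriCharacters = true,
            // \p{L} matches any Unicode letter; the extra characters are appended as well.
            ValidCharacters = @"\p{L}0-9\-_~\.\$" + validChars
        });
    }

    public void Initialize(InitializationEngine context) { }

    public void Uninitialize(InitializationEngine context) { }
}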
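To see which characters a given ValidCharacters value would let through, a rough standalone check can help. The sketch below assumes the option value is used as the body of a regular expression character class; that is an assumption about the internals, not something confirmed here. UrlSegmentCharacterCheck and FindRejectedCharacters are hypothetical names and are not part of the Optimizely API.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for experimenting with ValidCharacters values.
// It is NOT part of the Optimizely API.
public static class UrlSegmentCharacterCheck
{
    // Returns the characters of 'segment' that fall outside the allowed set.
    // Assumption: 'validCharacters' is the body of a regex character class.
    public static string FindRejectedCharacters(string validCharacters, string segment)
    {
        // Strip every allowed character; whatever remains would be rejected.
        return Regex.Replace(segment, "[" + validCharacters + "]", string.Empty);
    }

    public static void Main()
    {
        var asciiOnly = @"A-Za-z0-9\-_~\.\$";
        var extended = asciiOnly + "ü ö ä ß ó ñ á á é í ó ő ú ü ñ";
        var anyLetter = @"\p{L}0-9\-_~\.\$";

        // With the ASCII-only set, "ä" and "ß" are left over as rejected.
        Console.WriteLine(FindRejectedCharacters(asciiOnly, "käse-straße"));

        // With the extended set, nothing is rejected, so an empty line is printed.
        Console.WriteLine(FindRejectedCharacters(extended, "käse-straße"));

        // "č" is not in the explicit list, but \p{L} covers it.
        Console.WriteLine(FindRejectedCharacters(extended, "čaj"));
        Console.WriteLine(FindRejectedCharacters(anyLetter, "čaj"));
    }
}
```

Under the same assumption, two things follow: the spaces inside validChars would also count as allowed characters, and the \p{L} variant already covers every letter in the explicit list.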
References:
- https://support.optimizely.com/hc/en-us/articles/115005062883-Enable-special-characters-in-URL-Segment
- https://world.optimizely.com/blogs/Minesh-Shah/Dates/2023/2/url-rewrites-in-cms12--net-6-/
- An Introduction to Multilingual Web Addresses
Hello Khan,
It's worth noting that the built-in Optimizely behaviour aligns with the current URI (Uniform Resource Identifier) specification, RFC 3986. I've had issues before with non-compliant characters in URLs: browsers and documentation platforms interpreted them differently, which caused confusion over encoding. For that reason I would personally recommend against this.
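For context on the encoding point: RFC 3986 only permits a limited ASCII set in a URI, so a character such as "ä" has to be percent-encoded from its UTF-8 bytes whenever the address is handled as a plain URI. A minimal sketch using the standard .NET Uri helpers (the segment value is just an example):

```csharp
using System;

public static class PercentEncodingDemo
{
    public static void Main()
    {
        string segment = "käse";

        // "ä" is U+00E4, which is the two bytes C3 A4 in UTF-8.
        string encoded = Uri.EscapeDataString(segment);
        Console.WriteLine(encoded); // k%C3%A4se

        // Decoding restores the original characters.
        Console.WriteLine(Uri.UnescapeDataString(encoded)); // käse
    }
}
```

The same page can therefore appear as either /käse or /k%C3%A4se depending on the tool displaying it, which is one way the confusion described above can arise.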
Makes sense, thanks for sharing.