Functional String Processing: Implement in F#, Call from C#
- Author: Dexter Ajoku (X: @dexcompiler)
I recently had to take a string extracted from an XML file and rebuild it in a very specific format, under a number of constraints. Some of the requirements were:
- System must accept a string containing tagged content in the format /TAGx/content
- System must accept a priority map defining the processing order of tags [this was predefined in configuration]
- Tags must be processed in order of their assigned priority, lower number = higher priority
- Maximum content length must be 390 characters
- Content must be divided into segments of exactly 65 characters each
- Maximum of 6 segments allowed (65 × 6 = 390)
- Content can flow across segments without regard to tag boundaries
- Final segment may be shorter than 65 characters
- Each tag and its associated content must be processed in priority order
- If including a tag's content would cause total content length to exceed 390 characters:
  - The entire tag and its content must be excluded
  - All subsequent tags must also be excluded
  - Partial tag content is not allowed if it would breach the total length limit
There were more, but these were the hard requirements necessary for further processing.
This seemed like a good opportunity to use a functional approach and to get more serious about learning functional programming: it was my first foray into F# in a production setting, after a few hobby projects. The C# part of the application hands over the content as a list of StringBuilder objects, each of which looks like this:
var sb = new StringBuilder
("/TAG1/AAAAAAAAAAAAAAAAAAAA/TAG2/BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB/TAG3/CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC/TAG4/DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD/TAG5/EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE/TAG6/small-complete-tag/TAG7/another-small-tag/TAG8/this-should-not-appear");
Configuration data when materialized into the app would look like:
List<(string, int)> priorityMap =
[
    ("/TAG1/", 1),
    ("/TAG2/", 2),
    ("/TAG3/", 3),
    ("/TAG4/", 4),
    ("/TAG5/", 5),
    ("/TAG6/", 6),
    ("/TAG7/", 7),
    ("/TAG8/", 8)
];
Then we enter the string processing logic in F#. First, we define our core types to represent the segments and the result of the processing:
[<Struct>]
type TagSegment = {
    StartPos: int
    Length: int
    Priority: int
    Content: string
}

[<Struct>]
type ArrangeResult = {
    BuildString: StringBuilder
    Segments: TagSegment list
}
The TagSegment record captures each piece of tagged content along with its position, length, and priority. The ArrangeResult wraps both the final StringBuilder and the list of processed segments for convenient consumption from C#.
Now we build our processing pipeline. First, a helper to find where the next tag begins:
let private findNextTagPosition (content: string) (currentPos: int) (tag: string) (tags: string list) =
    tags
    |> List.choose (fun t ->
        // Search only past the current tag; Ordinal matches the comparison used in findTagSegments
        let pos = content.IndexOf(t, currentPos + tag.Length, StringComparison.Ordinal)
        if pos > currentPos then Some pos else None)
    |> function
        | [] -> content.Length
        | positions -> List.min positions
This function scans ahead from the current position to find the nearest occurrence of any tag. If no tag is found, it returns the end of the string. This lets us know where the current tag's content ends.
Next, we extract the tag segments on a positional basis:
let private findTagSegments (content: string) (tag: string) (priority: int) (allTags: string list) =
    let rec loop pos acc =
        match content.IndexOf(tag, pos, StringComparison.Ordinal) with
        | -1 -> List.rev acc
        | currentPos ->
            let nextPos = findNextTagPosition content currentPos tag allTags
            let segment = {
                StartPos = currentPos
                Length = nextPos - currentPos
                Priority = priority
                Content = content.[currentPos..nextPos-1].TrimEnd()
            }
            loop nextPos (segment :: acc)
    loop 0 []
Essentially:
- search for a tag starting at position pos
- if no tag is found [-1 situation], return the accumulated segments in reverse order to preserve the original order
- if a tag is found:
  - find the next tag position
  - create a new segment with the content between the current and next tag positions
  - recursively search from the next position for the next tag, adding the new segment to the accumulator
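To make the steps above concrete, here is a small standalone trace. It repeats the two functions (without the private modifier, and with an explicit ordinal comparison in the helper, so it can run as a script) against a made-up string in which one tag occurs twice. The sample input is an assumption for illustration, not production data.

```fsharp
open System

// Copy of the article's record so this script stands alone.
[<Struct>]
type TagSegment =
    { StartPos: int
      Length: int
      Priority: int
      Content: string }

// Same logic as the helper shown earlier: position of the nearest tag after the current one.
let findNextTagPosition (content: string) (currentPos: int) (tag: string) (tags: string list) =
    tags
    |> List.choose (fun t ->
        let pos = content.IndexOf(t, currentPos + tag.Length, StringComparison.Ordinal)
        if pos > currentPos then Some pos else None)
    |> function
        | [] -> content.Length
        | positions -> List.min positions

// Same recursive extraction as above.
let findTagSegments (content: string) (tag: string) (priority: int) (allTags: string list) =
    let rec loop (pos: int) acc =
        match content.IndexOf(tag, pos, StringComparison.Ordinal) with
        | -1 -> List.rev acc
        | currentPos ->
            let nextPos = findNextTagPosition content currentPos tag allTags
            let segment =
                { StartPos = currentPos
                  Length = nextPos - currentPos
                  Priority = priority
                  Content = content.[currentPos .. nextPos - 1].TrimEnd() }
            loop nextPos (segment :: acc)
    loop 0 []

// Hypothetical input in which /TAG1/ occurs twice.
let sample = "/TAG1/abc/TAG2/def/TAG1/gh"
let tag1Segments = findTagSegments sample "/TAG1/" 1 [ "/TAG1/"; "/TAG2/" ]

// Two segments: "/TAG1/abc" starting at 0 and "/TAG1/gh" starting at 18.
tag1Segments |> List.iter (fun s -> printfn "%d %s" s.StartPos s.Content)
```

Each occurrence of the tag becomes its own segment, and each segment ends where the next tag of any kind begins (or at the end of the string).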
With segment extraction in place, we need to collect segments from all tags and sort them by priority:
let private processSegments (content: string) (priorityMap: (string * int) list) =
    let tags = priorityMap |> List.map fst
    priorityMap
    |> List.collect (fun (tag, priority) ->
        findTagSegments content tag priority tags)
    |> List.sortBy (fun s -> s.Priority)
This function iterates through the priority map, extracts segments for each tag, flattens them into a single list, and sorts by priority. Lower priority numbers come first as indicated in the priority map, ensuring the most important content gets included.
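One detail worth noting is that List.sortBy is a stable sort, so segments that share a priority (for instance, a tag that occurs more than once) keep their relative order after sorting. The sketch below shows this with a simplified, hypothetical Seg record standing in for TagSegment; the values are made up for illustration.

```fsharp
// Simplified, hypothetical stand-in for TagSegment: only the fields needed here.
type Seg = { Priority: int; Content: string }

// Segments as they might come out of List.collect when the priority map
// lists /TAG2/ before /TAG1/: grouped per tag, in map order.
let collected =
    [ { Priority = 2; Content = "/TAG2/first" }
      { Priority = 2; Content = "/TAG2/second" }
      { Priority = 1; Content = "/TAG1/only" } ]

// List.sortBy is stable: the two priority-2 segments keep their original order.
let sorted = collected |> List.sortBy (fun s -> s.Priority)

sorted |> List.iter (fun s -> printfn "%d %s" s.Priority s.Content)
```

Because of this, the order in which tags appear in the priority map does not matter; only the priority numbers do.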
Next comes the validation step. This is where we enforce the 390-character limit and the "all or nothing" rule for tags:
let private validateSegments (maxLength: int) (segments: TagSegment list) =
    let rec loop acc length = function
        | [] -> List.rev acc
        | segment :: rest ->
            let newLength = length + segment.Content.Length
            if newLength <= maxLength
            then loop (segment :: acc) newLength rest
            else List.rev acc
    loop [] 0 segments
The validateSegments function walks through the priority-sorted segments, accumulating content length as it goes. The moment adding a segment would exceed maxLength, it stops and returns what it has; no partial inclusions are allowed. This satisfies the requirement that if a tag would cause a breach, it and all subsequent tags are excluded.
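A consequence of stopping at the first breach is that a later, smaller segment is dropped even if it would still fit. The sketch below applies the same accumulate-until-breach idea to plain strings with a limit of 10; the function name takeUntilBreach and the sample values are assumptions chosen for illustration.

```fsharp
// Same accumulate-until-breach idea as validateSegments, on plain strings
// so the snippet stands alone. The names here are illustrative only.
let takeUntilBreach (maxLength: int) (items: string list) =
    let rec loop acc length (remaining: string list) =
        match remaining with
        | [] -> List.rev acc
        | item :: rest ->
            let newLength = length + item.Length
            if newLength <= maxLength then
                loop (item :: acc) newLength rest
            else
                // Stop here: the breaching item and everything after it are dropped.
                List.rev acc
    loop [] 0 items

// With a limit of 10: "aaaa" (4) and "bbbb" (4) fit, "cccc" would make 12,
// so it is dropped, and "d" is dropped too even though it would fit on its own.
let kept = takeUntilBreach 10 [ "aaaa"; "bbbb"; "cccc"; "d" ]

printfn "%A" kept
```

This is a cutoff rather than a best-fit packing, which matches the requirement that all subsequent tags are excluded once one tag fails to fit.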
Finally, we build the result of the processing inside the buildFinalResult function:
let private buildFinalResult (segments: TagSegment list) =
    let result = StringBuilder(maxTotalLength)
    let combinedContent =
        let sb = StringBuilder()
        segments |> List.iter (fun s -> sb.Append(s.Content) |> ignore)
        sb.ToString()
    let rec splitIntoChunks pos content acc =
        match content with
        | "" -> List.rev acc
        | remaining ->
            let chunkSize = Math.Min(maxSetLength, remaining.Length)
            if pos + chunkSize > maxTotalLength then
                List.rev acc
            else
                let chunk = remaining.[0..chunkSize-1]
                let newSegment = {
                    StartPos = pos
                    Length = chunkSize
                    Priority = 0
                    Content = chunk
                }
                result.Append(chunk) |> ignore
                splitIntoChunks
                    (pos + chunkSize)
                    (remaining.[chunkSize..])
                    (newSegment :: acc)
    let finalSegments = splitIntoChunks 0 combinedContent []
    { BuildString = result; Segments = finalSegments }
The buildFinalResult function takes the validated segments and prepares the final output while respecting the size constraints. (maxSetLength and maxTotalLength are constants defined elsewhere in the module and not shown here; going by the requirements, they would be 65 and 390.)
- First, it combines all the segment contents into a single string
- Then it splits the combined content into chunks of 65 characters each, never exceeding 390 characters in total
- As it creates these chunks, it builds both a StringBuilder containing the final text and a list of segments that record where each chunk starts and how long it is. Think of it as dividing a long piece of text into evenly sized pages while keeping track of where each page starts and what it holds. The function does this recursively rather than with a traditional loop [which we would use if this were C#], accumulating chunks one at a time until it either runs out of content or hits the maximum length
- Finally, it packages the StringBuilder and the list of chunk segments into a single result that can be easily consumed by other parts of the application, particularly from C# call sites.
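The chunking step can also be looked at in isolation. The sketch below is not the recursive version above; it swaps in Array.chunkBySize and Array.truncate from FSharp.Core to get the same 65-character pieces, capped at 6 segments. The constant and function names are hypothetical, with the sizes taken from the requirements.

```fsharp
open System

// Segment size and segment cap, taken from the requirements listed earlier.
let segmentSize = 65
let maxSegments = 6

// Hypothetical helper: cut the text into segmentSize pieces and keep at most maxSegments.
let chunk (text: string) =
    text.ToCharArray()
    |> Array.chunkBySize segmentSize
    |> Array.truncate maxSegments
    |> Array.map (fun chars -> String(chars))
    |> Array.toList

// 150 characters split into 65 + 65 + 20.
let chunks = chunk (String('x', 150))

chunks |> List.iter (fun c -> printfn "%d" c.Length)
```

The recursive version has the advantage of recording each chunk's start position as it goes, which this shorter form would have to compute separately.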
The final function in the F# core guards the input (isValidInput is a check not shown here), pipelines the internal functions to process the StringBuilder, and is then wrapped so that it is callable from C# as a regular extension method:
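let arrangeSegments (sb: StringBuilder) (priorityMap: seq<string * int>) =
    if not (isValidInput sb priorityMap) then
        { BuildString = sb; Segments = [] }
    else
        let content = sb.ToString()
        let result =
            priorityMap
            |> Seq.toList
            |> processSegments content
            |> validateSegments maxTotalLength
            |> buildFinalResult
        sb.Clear() |> ignore
        sb.Append(result.BuildString) |> ignore
        result

// F# type extensions (type StringBuilder with ...) are not visible to C# callers,
// so the method is exposed through a static class marked with [<Extension>]
// (requires open System.Runtime.CompilerServices).
[<Extension>]
type SbExtensions =
    [<Extension>]
    static member ArrangeSegments (sb: StringBuilder, priorityMap: seq<struct (string * int)>) =
        // C# (string, int) tuples are struct tuples; convert them to F# reference tuples
        priorityMap
        |> Seq.map (fun (struct (tag, priority)) -> (tag, priority))
        |> StringBuilderExtension.arrangeSegments sb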
This allows us to call it from C# like this:
var result = builder.ArrangeSegments(priorityMap);
To see this in action, let's trace through our example input. The original string totals well over 390 characters:
| Tag | Length (including tag) | Running Total | Status |
|---|---|---|---|
| TAG1 | 26 | 26 | ✅ Included |
| TAG2 | 50 | 76 | ✅ Included |
| TAG3 | 111 | 187 | ✅ Included |
| TAG4 | 62 | 249 | ✅ Included |
| TAG5 | 77 | 326 | ✅ Included |
| TAG6 | 24 | 350 | ✅ Included |
| TAG7 | 23 | 373 | ✅ Included |
| TAG8 | 28 | 401 | ❌ Excluded (would exceed 390) |
TAG8's content "/TAG8/this-should-not-appear" is excluded entirely because including it would push us over the 390-character limit. The validated content is then split into six chunks (five full 65-character chunks and a shorter final one), and the result is returned with both the StringBuilder and the segment metadata.
This approach allowed me to build a robust and testable string processing pipeline in F# while keeping the core logic isolated and reusable. The extension method approach in C# made it easy to integrate the F# functionality into existing codebases without significant changes to the existing architecture. The functional paradigm allowed me to express the logic in a more declarative way, making it easier to reason about and test. By the way, you might notice some F#-Rust parallels here, and you would be correct.