add short inputs optimizations #925

@lemire

When processing short inputs (e.g., strings smaller than 16 bytes), calling our fast functions is wasted effort. Now that @pauldreik has done the hard work of moving our scalar (= naive) implementations into header files, it becomes possible to add short-string optimizations: when the input is sufficiently small, prefer the naive (but easy to inline) routine.

This can be an actual issue. Recently @anonrig tried to enable simdutf in ada, within Node, and he reported a negative effect on performance. Though I have not examined the use case, the likely cause in my view is that he is replacing a trivial function (small, pure C) with a non-trivial function call into simdutf.

You can find concrete examples in Node:

https://github.com/nodejs/node/blob/e9b0849606ccaa607698ccb2e88f7245113968c3/src/node_buffer.cc#L760-L764

https://github.com/nodejs/node/blob/e9b0849606ccaa607698ccb2e88f7245113968c3/src/string_bytes.cc#L542-L559

(The latter example is from @ChALkeR whereas I think that the first one is from me.)

My expectation is not that the actual implementations in simdutf are inefficient for short strings, but rather that replacing a less efficient but inlinable function with a non-inlinable function that is faster (as long as it has enough work to do) can be a slight net negative on short inputs.

With the current code base, it should now be relatively easy to add, directly in the simdutf header file, something like this ...

if(the input is short) {
  // call the simple scalar function
} else {
  // dispatch into the simdutf lib for the optimized function
}
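
To make the pseudocode above more concrete, here is a minimal, self-contained sketch using ASCII validation as the example. All names (`kShortInputThreshold`, `validate_ascii_scalar`, `validate_ascii_optimized`, `validate_ascii`) are hypothetical and are not simdutf's actual API. The 16-byte threshold only echoes the example figure mentioned earlier and would need to be chosen by benchmarking. The "optimized" routine is a word-at-a-time stand-in so that the sketch compiles on its own; in the real library that branch would be a non-inlined call into the runtime-dispatched kernel.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumption: inputs shorter than this take the scalar path. The value is
// illustrative only; benchmarks should decide it.
constexpr std::size_t kShortInputThreshold = 16;

// Hypothetical scalar routine: tiny and header-only, so the compiler can
// inline it at the call site.
inline bool validate_ascii_scalar(const char *buf, std::size_t len) {
  unsigned char acc = 0;
  for (std::size_t i = 0; i < len; i++) {
    acc |= static_cast<unsigned char>(buf[i]);
  }
  return acc < 0x80; // no byte had its high bit set
}

// Hypothetical stand-in for the optimized library routine. It is defined
// here only to keep the sketch self-contained; in the library it would live
// in a separate translation unit behind runtime dispatch.
inline bool validate_ascii_optimized(const char *buf, std::size_t len) {
  std::uint64_t acc = 0;
  std::size_t i = 0;
  for (; i + 8 <= len; i += 8) {
    std::uint64_t word;
    std::memcpy(&word, buf + i, sizeof(word)); // safe unaligned load
    acc |= word;
  }
  const bool words_ok = (acc & 0x8080808080808080ULL) == 0;
  // Check the remaining (fewer than 8) bytes with the scalar loop.
  return words_ok && validate_ascii_scalar(buf + i, len - i);
}

// Header-level wrapper: short inputs stay on the inlinable path, everything
// else goes to the optimized routine.
inline bool validate_ascii(const char *buf, std::size_t len) {
  if (len < kShortInputThreshold) {
    return validate_ascii_scalar(buf, len);
  }
  return validate_ascii_optimized(buf, len);
}
```

The length check is a single, well-predicted branch when callers mostly pass short strings, which is the case this issue is concerned with. Whether it pays off for a given function is exactly what the missing benchmarks would need to show.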

Obviously, this entails adding relevant benchmarks, which we do not have right now.
