"It is easy to get it to work, but difficult to do it right."
Doing the right thing relies entirely on the fundamentals: knowledge of Software Engineering. This is a big topic that I will continue to write articles about.
ES6 has introduced functional operators (`.map`, `.filter`, `.reduce`, etc.) on arrays.
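For illustration, a typical chain of these native operators might look like this (the `transform` and `isNotNull` functions are placeholders I made up for this sketch):

```typescript
// A typical chain of native array operators.
// `transform` and `isNotNull` are placeholder functions for illustration.
const transform = (n: number): number | null => (n % 2 === 0 ? n * 10 : null);
const isNotNull = (n: number | null): n is number => n !== null;

const result = [1, 2, 3, 4]
  .map(transform)       // [null, 20, null, 40]
  .filter(isNotNull)    // [20, 40]
  .reduce((acc, n) => acc + n, 0);

console.log(result); // 60
```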
When I started building a new project called kt.ts, which aims to bring Kotlin's well-designed API interface to TypeScript, I began with a wrapper approach that uses lodash internally to build the kt.ts abstractions. After a few hours of implementation and architecture design, a few commonly used interfaces were finished. Take ramda, for example:
```javascript
R.pipe(
  R.map(transform),
  R.filter(isNotNullPredicate),
  R.uniq,
  R.map(calculation),
  R.sum
)(array)
```
In Kotlin, we could simply express it this way:
```kotlin
array
    .mapNotNull(transform)
    .distinct()
    .sumBy(calculation)
```
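A minimal sketch of the wrapper idea, assuming a made-up `KtArray` class (the real kt.ts API and its lodash-backed internals differ):

```typescript
// Hypothetical sketch of a Kotlin-style chainable wrapper over arrays.
// Method names mirror Kotlin's stdlib; this is not the actual kt.ts code.
class KtArray<T> {
  constructor(private readonly items: ReadonlyArray<T>) {}

  // map + drop nulls in one step, like Kotlin's mapNotNull.
  mapNotNull<R>(transform: (item: T) => R | null): KtArray<R> {
    const out: R[] = [];
    for (const item of this.items) {
      const mapped = transform(item);
      if (mapped !== null) out.push(mapped);
    }
    return new KtArray(out);
  }

  // Deduplicate while preserving first-seen order, like Kotlin's distinct.
  distinct(): KtArray<T> {
    return new KtArray(Array.from(new Set(this.items)));
  }

  // Sum the values produced by `selector`, like Kotlin's sumBy.
  sumBy(selector: (item: T) => number): number {
    return this.items.reduce((acc, item) => acc + selector(item), 0);
  }
}

const sum = new KtArray([1, 2, 2, 3])
  .mapNotNull(n => (n === 3 ? null : n)) // [1, 2, 2]
  .distinct()                            // [1, 2]
  .sumBy(n => n);

console.log(sum); // 3
```

Note that each operator allocates a fresh wrapper and a fresh array, which is exactly the kind of per-step overhead a benchmark can expose.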
At that point, I kick-started a project to benchmark my wrapped implementations. The result was unexpectedly slow compared with the vanilla `.filter` approach, and completely unacceptable.
The first thoughts that came to my head were:

Did I do something wrong? Is V8's optimization too magical for me?

OMFG, there is an 80x performance difference!
To solve this problem, my sense of software engineering summoned me to implement every possible version of the same behavior. The results were so stunning that I had to group all my implementations into a single benchmark and design a scenario that would use `.distinct` to see whether the difference would deteriorate further.
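A stripped-down sketch of this kind of measurement (the actual repo uses a proper benchmark harness with warm-up and many samples; this naive timer is only illustrative):

```typescript
// Naive timing sketch: real benchmarks need warm-up, many samples, and
// statistical analysis; this only illustrates the shape of the comparison.
const data = Array.from({ length: 100_000 }, (_, i) => i);

function timeMs(label: string, fn: () => number): number {
  const start = performance.now();
  const result = fn();
  const elapsed = performance.now() - start;
  console.log(`${label}: ${elapsed.toFixed(2)} ms (result=${result})`);
  return elapsed;
}

// Chained declarative version: allocates an intermediate array per step.
timeMs("chained", () =>
  data
    .map(n => n * 2)
    .filter(n => n % 3 === 0)
    .reduce((acc, n) => acc + n, 0)
);

// Imperative version: the same work in one pass, no intermediate arrays.
timeMs("imperative", () => {
  let sum = 0;
  for (const n of data) {
    const doubled = n * 2;
    if (doubled % 3 === 0) sum += doubled;
  }
  return sum;
});
```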
A Simple Benchmark
I implemented the same behavior in 20 variations using:
- lazy sequence
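The lazy-sequence variation, for instance, can be sketched with generators (my own simplified version, not the benchmarked code):

```typescript
// Lazy-sequence variation: each element flows through the whole pipeline
// one at a time, so no intermediate arrays are allocated.
function* mapLazy<T, R>(source: Iterable<T>, fn: (t: T) => R): Generator<R> {
  for (const item of source) yield fn(item);
}

function* filterLazy<T>(source: Iterable<T>, pred: (t: T) => boolean): Generator<T> {
  for (const item of source) if (pred(item)) yield item;
}

function sumLazy(source: Iterable<number>): number {
  let total = 0;
  for (const n of source) total += n;
  return total;
}

const lazySum = sumLazy(
  filterLazy(mapLazy([1, 2, 3, 4], n => n * 2), n => n > 4)
);
console.log(lazySum); // 14  (6 + 8)
```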
Surprisingly, the ramda and lodash/fp benchmark results are even worse.
You can check the dedicated implementations at https://github.com/gaplo917/js-benchmark.
Accurate Benchmark on Packet Bare Metal
In order to get an accurate benchmark, we need to make sure the benchmark environment has no significant background workload, no thermal throttling, and no noisy-neighbor interference (if on a VM). That's why I rented a Packet bare-metal machine (t1.small.x86) in NTR to perform the benchmark, with a minimum of 1,000 samples for each benchmark.
They have to learn the necessary algorithms and data structures and write them in an imperative (control flow) style to get the best performance instead.
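For the pipeline discussed earlier, an imperative version fuses the map, distinct, and sum steps into a single pass (the function name and its parameters are my own, for illustration):

```typescript
// Imperative, single-pass equivalent of mapNotNull + distinct + sumBy.
// `transform` and `calculation` are placeholders for illustration.
function sumDistinctTransformed(
  array: number[],
  transform: (n: number) => number | null,
  calculation: (n: number) => number
): number {
  const seen = new Set<number>();
  let sum = 0;
  for (const item of array) {
    const t = transform(item);
    // Skip nulls and values already counted, all in the same loop body.
    if (t !== null && !seen.has(t)) {
      seen.add(t);
      sum += calculation(t);
    }
  }
  return sum;
}

console.log(sumDistinctTransformed([1, 1, 2, 3], n => (n > 2 ? null : n), n => n * 10));
// 30  (distinct transformed values 1 and 2, times 10 each)
```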
Otherwise, under a single Node process, you need to sacrifice some performance in exchange for readability and development speed. That is the "Declarative Cost".
If someone told you that a single `.map` operator in lodash or ramda is just 20% slower than the native operator, they forgot that we normally chain 3 to 5 operators on average. Compounding that slowdown gives 0.8^3 ≈ 0.51 to 0.8^5 ≈ 0.33, nearly a 50-70% performance drop on average across reasonable array sizes.
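This compounding model can be spelled out in a few lines (the 0.8 factor is the assumed per-operator relative throughput from the claim above):

```typescript
// Hypothetical compounding model: if each chained operator runs at ~80% of
// native throughput, k chained operators run at roughly 0.8^k of it.
const relativeThroughput = (k: number): number => Math.pow(0.8, k);

console.log(relativeThroughput(3).toFixed(2)); // "0.51"
console.log(relativeThroughput(5).toFixed(2)); // "0.33"
```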
The attached PDF files contain the full benchmark results for array sizes 1, 10, 100, 1k, 10k, 100k, and 1M.