Skip to content

Optimize removeDuplicateGradient to avoid O(n²) in the average case

Created by: nthykier

The original implementation of removeDuplicateGradient does O(n²) search over all gradients to remove duplicates. In images with many gradients (such as MediaWiki_logo_1.svg), this becomes a significant overhead.

This patch optimizes for the average case by splitting gradients into smaller lists (called "buckets" in the code). The splitting is done by selecting some attributes to generate a key with the following properties:

  • If multiple gradients have the same key, then a subset of those gradients /might/ be duplicates of each other.
  • If their keys are not identical, then they cannot be duplicates.

Note that in worst case, we will still hit O(n²) and it is easily possible to construct svg files that deliberately triggers the O(n²) runtime.

With that caveat aside, this improves the runtime performance on MediaWiki_logo_1.svg by about 25% (8m51s -> 6m40s on 5 runs).

Original: $ time for I in $(seq 1 5) ; do
python3 -m scour.scour MediaWiki_logo_1.svg out.svg ;
done Scour processed file "heavy.svg" in 105042 ms: 1582746/4989544 bytes new/orig -> 31.7% Scour processed file "heavy.svg" in 103412 ms: 1582746/4989544 bytes new/orig -> 31.7% Scour processed file "heavy.svg" in 105334 ms: 1582746/4989544 bytes new/orig -> 31.7% Scour processed file "heavy.svg" in 107902 ms: 1582746/4989544 bytes new/orig -> 31.7% Scour processed file "heavy.svg" in 108161 ms: 1582746/4989544 bytes new/orig -> 31.7%

8m51.855s ...

Optimized: Scour processed file "heavy.svg" in 78162 ms: 1582746/4989544 bytes new/orig -> 31.7% Scour processed file "heavy.svg" in 81202 ms: 1582746/4989544 bytes new/orig -> 31.7% Scour processed file "heavy.svg" in 81554 ms: 1582746/4989544 bytes new/orig -> 31.7% Scour processed file "heavy.svg" in 80067 ms: 1582746/4989544 bytes new/orig -> 31.7% Scour processed file "heavy.svg" in 77267 ms: 1582746/4989544 bytes new/orig -> 31.7%

Signed-off-by: Niels Thykier niels@thykier.net

Merge request reports

Loading