Merged
Optimize IOSource#read_until method
## Why?
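`IOSource#read_until` calls `encode(term)` on every invocation, even when the same `term` has already been encoded. The result of `encode(term)` can be cached per `term`.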
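As a rough illustration of the caching idea, here is a minimal, self-contained sketch. `TermEncoder`, `encoded_term`, and the `encode_calls` counter are hypothetical names used only for this example; they are not part of REXML, and the real `encode` in `lib/rexml/source.rb` may behave differently.

```ruby
# Hypothetical stand-in class; NOT REXML's actual implementation.
class TermEncoder
  attr_reader :encode_calls

  def initialize(encoding)
    @encoding = encoding
    @term_cache = {}   # maps a raw term to its encoded form
    @encode_calls = 0  # counts how often the expensive path runs
  end

  # Returns the encoded term, encoding it only on the first request.
  def encoded_term(term)
    @term_cache[term] ||= encode(term)
  end

  private

  # Assumed to be the costly step that is worth caching.
  def encode(term)
    @encode_calls += 1
    term.encode(@encoding)
  end
end

encoder = TermEncoder.new(Encoding::UTF_16LE)
3.times { encoder.encoded_term(">") }
puts encoder.encode_calls
```

The `||=` form only works as a cache because the encoded string is always truthy; a cache whose values could be `nil` or `false` would need `Hash#fetch` with a block instead.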

## Benchmark

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.4/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.4 (2024-07-09 revision be1089c8ec) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     17.546      18.512        32.282       32.306 i/s -     100.000 times in 5.699323s 5.402026s 3.097658s 3.095448s
                 sax     25.435      28.294        47.526       50.074 i/s -     100.000 times in 3.931613s 3.534310s 2.104122s 1.997057s
                pull     29.471      31.870        54.400       57.554 i/s -     100.000 times in 3.393211s 3.137793s 1.838222s 1.737494s
              stream     29.169      31.153        51.613       52.898 i/s -     100.000 times in 3.428318s 3.209941s 1.937508s 1.890424s

Comparison:
                              dom
         after(YJIT):        32.3 i/s
        before(YJIT):        32.3 i/s - 1.00x  slower
               after:        18.5 i/s - 1.75x  slower
              before:        17.5 i/s - 1.84x  slower

                              sax
         after(YJIT):        50.1 i/s
        before(YJIT):        47.5 i/s - 1.05x  slower
               after:        28.3 i/s - 1.77x  slower
              before:        25.4 i/s - 1.97x  slower

                             pull
         after(YJIT):        57.6 i/s
        before(YJIT):        54.4 i/s - 1.06x  slower
               after:        31.9 i/s - 1.81x  slower
              before:        29.5 i/s - 1.95x  slower

                           stream
         after(YJIT):        52.9 i/s
        before(YJIT):        51.6 i/s - 1.02x  slower
               after:        31.2 i/s - 1.70x  slower
              before:        29.2 i/s - 1.81x  slower

```

- YJIT=ON : 1.00x - 1.06x faster
- YJIT=OFF : 1.05x - 1.11x faster
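
The speedup ranges above can be re-derived from the i/s columns of the benchmark table. The sketch below is only a convenience for checking the arithmetic; `RESULTS` and `speedup` are names introduced for this example, and the values are copied from the table as printed.

```ruby
# Iterations per second, copied from the benchmark table above.
RESULTS = {
  dom:    { before: 17.546, after: 18.512, before_yjit: 32.282, after_yjit: 32.306 },
  sax:    { before: 25.435, after: 28.294, before_yjit: 47.526, after_yjit: 50.074 },
  pull:   { before: 29.471, after: 31.870, before_yjit: 54.400, after_yjit: 57.554 },
  stream: { before: 29.169, after: 31.153, before_yjit: 51.613, after_yjit: 52.898 },
}.freeze

# Ratio of throughput after the change to throughput before it.
def speedup(before, after)
  after / before
end

RESULTS.each do |name, r|
  printf("%-6s YJIT=OFF %.3fx  YJIT=ON %.3fx\n",
         name,
         speedup(r[:before], r[:after]),
         speedup(r[:before_yjit], r[:after_yjit]))
end
```

Three decimal places are printed because the summary figures are coarser; small differences in the last digit come down to rounding.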
naitoh committed Oct 9, 2024
commit aa3d95427563fbca6b024656fe86516c751e3ca1
3 changes: 2 additions & 1 deletion lib/rexml/source.rb
@@ -77,6 +77,7 @@ def initialize(arg, encoding=nil)
     detect_encoding
   end
   @line = 0
+  @term_encord = {}
 end

 # The current buffer (what we're going to read next)
@@ -227,7 +228,7 @@ def read(term = nil, min_bytes = 1)

 def read_until(term)
   pattern = Private::PRE_DEFINED_TERM_PATTERNS[term] || /#{Regexp.escape(term)}/
-  term = encode(term)
+  term = @term_encord[term] ||= encode(term)
   until str = @scanner.scan_until(pattern)
     break if @source.nil?
     break if @source.eof?