Skip to content

add simstore table access via slice#1050

Merged
dwhswenson merged 6 commits into
openpathsampling:masterfrom
sroet:implement_slicing
Aug 27, 2021
Merged

add simstore table access via slice#1050
dwhswenson merged 6 commits into
openpathsampling:masterfrom
sroet:implement_slicing

Conversation

@sroet

@sroet sroet commented Aug 20, 2021

Copy link
Copy Markdown
Member

Today, I got annoyed today that I could not do
steps = storage.steps[:100] in some test code using simstore.

This adds handling of slices in the __getitem__ logic.
It behaves as a list would do (so an empty list on an empty slice, a list with a single item for slices like [4:5], a single item with int and a TypeError if it is not an int or slice)

It should also leverage batched loading, if I understood the code correctly

One assumption that I added is that the name of objects can only be strings (for _name_to_uuid), I don't know if this is currently an assumption/requirement, but all tests seem to pass on this.

This code is python > .3.6 (f strings) and the tests are python > 3.7 (leverages dict order)

@sroet sroet changed the title add access via slice add simstore table access via slice Aug 20, 2021
@sroet sroet requested a review from dwhswenson August 20, 2021 15:13
Comment thread openpathsampling/experimental/simstore/storage.py Outdated
@codecov

codecov Bot commented Aug 20, 2021

Copy link
Copy Markdown

Codecov Report

Merging #1050 (5bc8ec0) into master (d248898) will decrease coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1050      +/-   ##
==========================================
- Coverage   81.57%   81.57%   -0.01%     
==========================================
  Files         140      140              
  Lines       15421    15421              
==========================================
- Hits        12580    12579       -1     
- Misses       2841     2842       +1     
Impacted Files Coverage Δ
openpathsampling/netcdfplus/cache.py 63.72% <0.00%> (-0.33%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d248898...5bc8ec0. Read the comment docs.

@dwhswenson dwhswenson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks -- I've also noticed this as a minor annoyance (usually I'm too busy with whatever I'm working on and just use list(storage.steps)[start:end], because hacks like that are obvious to me -- that kind of hack is often NOT obvious to our users, and we shouldn't expect them to know it -- so this is a very useful improvement.)

The only required change is to fix a typo in an error message. Beyond that, I hope you'll explain why one part of the code works (because I don't get it!), and other things are fixes for a mistake I made and some general "let's agree on some code style that isn't officially defined in PEP-8" stuff.

Comment thread openpathsampling/experimental/simstore/storage.py Outdated
Comment thread openpathsampling/experimental/simstore/storage.py Outdated
Comment thread openpathsampling/experimental/simstore/storage.py Outdated
Comment thread openpathsampling/experimental/simstore/storage.py Outdated
Comment thread openpathsampling/experimental/simstore/storage.py Outdated
Comment thread openpathsampling/experimental/simstore/storage.py Outdated
@sroet

sroet commented Aug 23, 2021

Copy link
Copy Markdown
Member Author

Thanks -- I've also noticed this as a minor annoyance (usually I'm too busy with whatever I'm working on and just use list(storage.steps)[start:end], because hacks like that are obvious to me -- that kind of hack is often NOT obvious to our users, and we shouldn't expect them to know it -- so this is a very useful improvement.)

Yeah, that would be a solution if I actually could load all the steps (100_000) on my local machine in a reasonable time, I can't 😆

My hack around was to write something like

stop = 100:
steps = []
for i, e in storage.steps:
    steps.append(e)
    if i > stop:
        break

Which was not great xD

With d1c8609, I actually test the correct code path. I also tested that this works (and hits the code) on the "Simstore and CLI" minitutorial.

@sroet

sroet commented Aug 23, 2021

Copy link
Copy Markdown
Member Author

@dwhswenson please have another look, thanks for catching the wrong code!

@sroet sroet requested a review from dwhswenson August 23, 2021 10:03
@sroet

sroet commented Aug 23, 2021

Copy link
Copy Markdown
Member Author

also, I would like to know if:

One assumption that I added is that the name of objects can only be strings (for _name_to_uuid), I don't know if this is currently an assumption/requirement, but all tests seem to pass on this.

Is a reasonable assumption to enforce

@dwhswenson

Copy link
Copy Markdown
Member

One assumption that I added is that the name of objects can only be strings (for _name_to_uuid), I don't know if this is currently an assumption/requirement, but all tests seem to pass on this.

I think this should be the case, although it looks like SimStore isn't raising an error if you try to save an object with a non-string name. The StorableNamedObject.named method allows non-string names. As long as the name attribute is something that SimStore can turn into a JSON string, it will be saved in SimStore. (netcdfplus does catch this error on saving, because it doesn't treat the name attribute as part of the JSON that is generated.)

PR coming to make this explicitly required.

@dwhswenson dwhswenson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more tiny change for readability (sorry to miss this one before). Otherwise, looks good.

I was debating whether to do isinstance(item, slice) or type(item) is slice: I agree that there's no distinction, since slice is final (can't be subclassed), but I'd also probably look for this code by grepping "isinstance". Anyway, either one is fine.

Comment thread openpathsampling/experimental/simstore/storage.py Outdated
Co-authored-by: David W.H. Swenson <dwhs@hyperblazer.net>
@sroet sroet requested a review from dwhswenson August 23, 2021 14:37

@dwhswenson dwhswenson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Will merge this one now.

@dwhswenson dwhswenson merged commit a35d0c4 into openpathsampling:master Aug 27, 2021
@sroet sroet deleted the implement_slicing branch August 29, 2021 08:18
@dwhswenson dwhswenson mentioned this pull request Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants