Revision of JPath library
|Status:||In Progress||Start date:||03/25/2017|
|Assignee:||Jan Mach||% Done:|
|Category:||Development - Core|
Because mentat-inspector.py is now heavily used and expectations keep growing, the library should be reviewed and its implementation finished, so that everything works as expected and mentat-inspector.py delivers.
JPath is a simplified version of JSONPath that can be used for addressing nodes within an arbitrary data structure composed of dict-like and list-like objects. In principle it works with any data structure whose objects implement the Python list and/or dict interface.
The motivation for implementing this module was the following two use cases:
- Writing of simple rules in filtering expressions, for example:
Source.IP4 in [192.168.0.0/24, 192.168.0.0/24]
- Simple message modifications based on the key => value rules, for example:
"Source.Type[*]" = "source type tag"
The obvious first choice for a solution was the jsonpath-rw library. However, full JSONPath seems to be too heavy for our needs, and in some cases it could even let users cut the branch they are sitting on. For this reason we designed this simplified version with only basic features.
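To make the idea of addressing nodes concrete, here is a minimal sketch. It is not the mentat implementation: the helper name get_values() is hypothetical, and it only understands dotted names and the '[*]' wildcard (numeric indices are left out on purpose).

```python
from collections.abc import Mapping, MutableSequence


def get_values(data, path):
    """Return every value addressed by a dotted path (hypothetical helper).

    A chunk written as 'Name[*]' fans out over all items of a list.
    """
    nodes = [data]
    for chunk in path.split("."):
        wildcard = chunk.endswith("[*]")
        name = chunk[:-3] if wildcard else chunk
        found = []
        for node in nodes:
            # Skip anything that is not dict-like or lacks the key.
            if not isinstance(node, Mapping) or name not in node:
                continue
            value = node[name]
            if wildcard:
                if isinstance(value, MutableSequence):
                    found.extend(value)
            else:
                found.append(value)
        nodes = found
    return nodes


msg = {
    "Source": [
        {"IP4": ["192.168.0.1"], "Type": ["Phishing"]},
        {"IP4": ["10.0.0.1"]},
    ]
}
ips = get_values(msg, "Source[*].IP4[*]")
types = get_values(msg, "Source[*].Type[*]")
```

Missing keys simply produce no result instead of raising, which is one possible design choice for filtering rules; the real library may behave differently.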
Documentation must contain examples for common use cases.
Fix: Finished the implementation of appending to the list.
Previously, appending to a list did not work; the path_set() method had to be updated to make use of the '*' index modifier in JPath. (Redmine issue: #3392)
Implemented jpath_exists() method.
This method simply checks whether the given JPath element exists and returns True or False. (Redmine issue: #3392)
Renamed parse_path() to jpath_parse() and created a custom JPathException class.
Renamed the method and created a custom exception class to be raised instead of the generic Exception. (Redmine issue: #3392)
Refactoring of all path_...() methods to jpath_...().
Refactored names of all remaining path_...() methods to jpath_...() to be more clear and consistent. (Redmine issue: #3392)
Added path and match attributes to JPath chunks.
When parsing JPath into chunks, path and match attributes are added to the result to keep the information about preceding nodes within the chunk and enable better debugging output about possible failures. (Redmine issue: #3392)
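A rough sketch of what such chunk parsing could look like follows. Everything in it is an assumption made for illustration: the chunk grammar, the function name parse_path_sketch(), the PathError class, and the "name" and "index" keys. Only the idea of storing "path" and "match" with each chunk comes from the entry above, and the reading of "path" as "everything from the root up to this chunk" is an interpretation.

```python
import re

# Hypothetical chunk grammar: a node name optionally followed by [index] or [*].
CHUNK_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)(?:\[(\d+|\*)\])?$")


class PathError(Exception):
    """Raised for malformed paths (stand-in for a custom exception class)."""


def parse_path_sketch(path):
    """Split a path into chunk dicts carrying debugging context."""
    chunks = []
    seen = []
    for raw in path.split("."):
        match = CHUNK_RE.match(raw)
        if match is None:
            raise PathError("Invalid chunk '{}' in path '{}'".format(raw, path))
        seen.append(raw)
        chunk = {
            "match": raw,            # the chunk exactly as written
            "path": ".".join(seen),  # path from the root up to this chunk
            "name": match.group(1),
        }
        if match.group(2) is not None:
            chunk["index"] = match.group(2)  # kept as text: digits or '*'
        chunks.append(chunk)
    return chunks


chunks = parse_path_sketch("Source[*].Type[*]")
```

With this shape, an error raised while walking a message can report chunk["path"] to show how far resolution got before it failed.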
Bugfix: JPath library was not working with IDEA messages.
The JPath library did not work with IDEA messages, because these contain no plain dicts and lists. The isinstance checks had to be updated accordingly. Assertions for working with IDEA messages were added to the unit test case to prevent this from happening again. (Redmine issues: #3392 and #1017)
Implemented support for conditional value overwriting and uniqueness.
The jpath_set() method now optionally lets the user specify whether an already existing value should be overwritten, and whether a value should be unique. The uniqueness option works only for lists at the end of a JPath. (Redmine issues: #3392, #3361 and #1017, enables #3372 and #3396)
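The semantics described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual jpath_set(): the function name set_value(), its signature, its True/False return value, and the restriction to dotted names with a trailing '[*]' are all hypothetical.

```python
def set_value(data, path, value, overwrite=True, unique=False):
    """Set a value at a dotted path (hypothetical helper).

    A trailing 'Name[*]' appends to a list. Returns True when the data
    structure was modified and False when the write was skipped.
    """
    chunks = path.split(".")
    node = data
    for chunk in chunks[:-1]:
        # Create intermediate dicts on the way down.
        node = node.setdefault(chunk, {})
    last = chunks[-1]
    if last.endswith("[*]"):
        target = node.setdefault(last[:-3], [])
        if unique and value in target:
            return False  # uniqueness only applies to a list at the path end
        target.append(value)
        return True
    if last in node and not overwrite:
        return False  # keep the existing value
    node[last] = value
    return True


msg = {"Node": {"Name": "old"}}
kept = set_value(msg, "Node.Name", "new", overwrite=False)
first = set_value(msg, "Node.Tags[*]", "a", unique=True)
second = set_value(msg, "Node.Tags[*]", "a", unique=True)
```

Here kept and second are False because the existing name is preserved and the duplicate tag is rejected, while first is True.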
Fixed wrong isinstance() comparisons.
The previous implementation had the following flaws: the collections.abc module does not exist on older versions of Python, and comparing against the immutable Sequence resulted in strings being treated as lists, which is not what we want. (Redmine issue: #3392)
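Both flaws can be reproduced in a few lines. The sketch below shows one way to avoid them (an import fallback plus a check against MutableSequence); it is not necessarily the fix used in the library, and the helper name is_list_like() is hypothetical.

```python
try:
    # Python 3.3 and newer
    from collections.abc import Sequence, MutableSequence
except ImportError:
    # Older Python versions keep the ABCs directly in collections
    from collections import Sequence, MutableSequence


def is_list_like(value):
    """True for list-like containers, False for strings and tuples."""
    return isinstance(value, MutableSequence)


# A string is an (immutable) Sequence, so a Sequence check treats it as a list.
string_is_sequence = isinstance("abc", Sequence)
string_is_list_like = is_list_like("abc")
list_is_list_like = is_list_like(["a", "b", "c"])
```

string_is_sequence is True, which is the source of the bug, while string_is_list_like is False and list_is_list_like is True.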
Performance optimization of mentat.filtering.jpath.jpath_parse() function.
Implemented jpath_parse_c() as a caching variant of the jpath_parse() function. Benchmarking confirmed a huge performance improvement; see mentat.filtering.benchmark.bench_jpath.py for details. Another small improvement (a 14% performance increase) was achieved by moving compilation of the chunk regular expression to a global module variable. All module functions now use the caching variant internally. Also implemented cache_size() and cache_clear() functions for cache management. Beware that jpath_parse_c() does not make a deep copy of the value returned from the cache (for performance reasons). Treat those values as read-only, or suffer the consequences. (Redmine issues: #3392 and #1017)
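The read-only warning is easier to see in a small sketch. The code below is an assumed, simplified model of such a cache: parse_sketch() and parse_cached() are hypothetical stand-ins for jpath_parse() and jpath_parse_c(), and the bodies of cache_size() and cache_clear() are guesses at what functions of those names would do.

```python
_CACHE = {}


def parse_sketch(path):
    """Stand-in parser: one dict per dot-separated chunk."""
    return [{"match": raw} for raw in path.split(".")]


def parse_cached(path):
    """Return the parsed path, parsing it only on the first request."""
    if path not in _CACHE:
        _CACHE[path] = parse_sketch(path)
    # No deep copy: every caller receives the very same list object.
    return _CACHE[path]


def cache_size():
    return len(_CACHE)


def cache_clear():
    _CACHE.clear()


first = parse_cached("Node.Name")
second = parse_cached("Node.Name")
shared = first is second

# Mutating the returned value silently corrupts the cache for everyone.
first.append({"match": "Oops"})
corrupted_length = len(parse_cached("Node.Name"))
```

Because shared is True, the stray append is visible to every later caller, so corrupted_length is 3 rather than 2. Returning copy.deepcopy() of the cached value would prevent this, at the cost of the speed the cache was meant to provide.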
#2 Updated by Jan Mach almost 2 years ago
- Status changed from New to Feedback
- Assignee changed from Jan Mach to Pavel Kácha
- Priority changed from High to Normal
Because mentat-inspector.py is currently heavily used and needs to meet expectations, I have made many improvements to the key JPath library. I have polished the whole library and added documentation and unit tests. We should now know exactly what the library is capable of and what its possible drawbacks are, and we can think about improvements if needed. Reading the documentation at the head of the file should be sufficient preparation for discussions.
Besides general polishing, I have implemented support for appending values to an array, as was requested by email. However, it is not yet deployed on the production server.
#4 Updated by Jan Mach almost 2 years ago
- Status changed from Feedback to In Progress
At the next face-to-face meeting, consider the following options:
- Support for unique sets of values
- Start indexing at value '0' instead of '1'
- Remove the '*' index in favor of empty brackets '[]', or use them as an alternative syntax
- Change syntax for better support of lists to:
#6 Updated by Pavel Kácha almost 2 years ago
Resolutions from meeting:
- overwrite and unique added as part of api, but not part of syntax (#3396, #3372)
- zero based indexing - yes
- Node.FirstSubelement.ListAttribute is still fine; the separator is (“.”, “[”), and a trailing “]” then distinguishes object and array subscripts.
- Let’s stay with asterisk (Node[*]), we don’t particularly like it, but it’s explicit and consistent with possible extension to object notation (_CESNET.*).