Task #3392

Revision of JPath library

Added by Jan Mach almost 2 years ago. Updated 8 months ago.

Status: In Progress
Start date: 03/25/2017
Priority: Normal
Due date:
Assignee: Jan Mach
% Done: 0%
Category: Development - Core
Target version: Future

Description

Because mentat-inspector.py is now heavily used and expectations of it keep growing, it would be wise to review the library and finish its implementation, so that everything works as expected and mentat-inspector.py delivers.

JPath is a simplified version of JSONPath and can be used to address nodes within an arbitrary data structure composed of dict-like and list-like objects. In practice it works with any data structure whose objects implement the Python list and/or dict interface.
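To make the idea of addressing nodes concrete, here is a minimal, self-contained sketch of how such a path could be resolved against nested dict-like and list-like objects. It is not the code of the JPath library: the function name `resolve`, the chunk pattern, the 1-based indexing, and the rule that a missing index returns all list items are assumptions made for illustration only.

```python
import re
from collections.abc import Mapping, MutableSequence

# Hypothetical chunk syntax: a node name, optionally followed by [N] or [*].
CHUNK = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)(?:\[(\d+|\*)\])?')


def resolve(data, path):
    """Return every value addressed by ``path`` (empty list if nothing matches)."""
    nodes = [data]
    for raw in path.split('.'):
        match = CHUNK.fullmatch(raw)
        if match is None:
            raise ValueError('invalid path chunk: %r' % raw)
        name, index = match.groups()
        found = []
        for node in nodes:
            if not isinstance(node, Mapping) or name not in node:
                continue
            child = node[name]
            if isinstance(child, MutableSequence):
                if index in (None, '*'):
                    found.extend(child)  # assumption: no index or '*' means all items
                elif 1 <= int(index) <= len(child):
                    found.append(child[int(index) - 1])  # assumption: 1-based index
            elif index is None:
                found.append(child)  # scalar value or nested mapping
        nodes = found
    return nodes


message = {
    'Source': [
        {'IP4': ['192.168.0.1', '10.0.0.1']},
        {'IP4': ['172.16.0.1']},
    ],
}
print(resolve(message, 'Source.IP4'))        # ['192.168.0.1', '10.0.0.1', '172.16.0.1']
print(resolve(message, 'Source[2].IP4[1]'))  # ['172.16.0.1']
```

A path that addresses nothing yields an empty list in this sketch rather than raising, which keeps filter expressions simple to evaluate.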

The motivation for implementing this module was the following two use cases:

  1. Writing of simple rules in filtering expressions, for example:
        Source.IP4 in [192.168.0.0/24, 192.168.0.0/24]
  2. Simple message modifications based on key => value rules, for example:
        "Source[1].Type[*]" = "source type tag"

The obvious first choice was the jsonpath-rw library. Full JSONPath, however, seems to be too big a gun for our needs, and in some cases it could even let users cut the branch they are sitting on. For this reason we designed this simplified version with only basic features.

Documentation must contain examples for common use cases.


Related issues

Related to Mentat - Task #3393: Revision of pynspect library Closed 03/25/2017

Associated revisions

Revision 0ba242ed
Added by Jan Mach almost 2 years ago

Fix: Finished the implementation of appending to the list.

Previously, appending to the list did not work; the path_set() method had to be updated to make use of the '*' index modifier in JPath. (Redmine issue: #3392)

Revision 62c51ad2
Added by Jan Mach almost 2 years ago

Implemented jpath_exists() method.

This method simply checks for the existence of the given JPath element and returns True or False. (Redmine issue: #3392)

Revision 7c56581b
Added by Jan Mach almost 2 years ago

Refactoring from parse_path() to jpath_parse() and created custom JPathException class.

Refactored method name and created custom exception class to be thrown instead of generic Exception. (Redmine issue: #3392)

Revision 9252cd11
Added by Jan Mach almost 2 years ago

Improved documentation and code readability in jpath.py library.

(Redmine issues: #3392 and #3361)

Revision 13f3791d
Added by Jan Mach almost 2 years ago

Refactoring of all path_...() methods to jpath_...().

Refactored names of all remaining path_...() methods to jpath_...() to be more clear and consistent. (Redmine issue: #3392)

Revision 7c3d1a9b
Added by Jan Mach almost 2 years ago

Improved unit tests for mentat.filtering.jpath module.

(Redmine issues: #3392 and #1017)

Revision 6314546d
Added by Jan Mach almost 2 years ago

Added path and match attributes to JPath chunks.

When parsing JPath into chunks, path and match attributes are added to the result to keep the information about preceding nodes within the chunk and enable better debugging output about possible failures. (Redmine issue: #3392)
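One possible reading of this change, sketched below: each parsed chunk carries the text it matched and the cumulative path up to and including itself, so an error message can say exactly where parsing or lookup went wrong. The dictionary keys `name` and `index`, the function name `parse`, and the chunk pattern are assumptions; only the attribute names `path` and `match` come from the commit message.

```python
import re

# Hypothetical chunk pattern: node name plus optional [N] or [*] index.
CHUNK_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)(?:\[(\d+|\*)\])?')


def parse(path):
    """Split a path into chunk dicts that remember where they came from."""
    chunks = []
    seen = []
    for raw in path.split('.'):
        match = CHUNK_RE.fullmatch(raw)
        if match is None:
            # The text parsed so far makes the failure easy to locate.
            raise ValueError(
                'invalid chunk %r after %r' % (raw, '.'.join(seen))
            )
        seen.append(raw)
        chunks.append({
            'match': raw,             # text of this chunk only
            'path': '.'.join(seen),   # everything up to and including this chunk
            'name': match.group(1),
            'index': match.group(2),  # None, a digit string, or '*'
        })
    return chunks


for chunk in parse('Source[1].Type[*]'):
    print(chunk)
```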

Revision efaa9467
Added by Jan Mach almost 2 years ago

Stability improvements in library mentat.filtering.jpath.

Improved error handling in JPath library, better input validation. Additions include unit tests. (Redmine issues: #3392 and #1017)

Revision 6803d78a
Added by Jan Mach almost 2 years ago

Ensured allowed character set for node names in JPath.

Explicit list of allowed characters instead of "\w", including unit tests and documentation. (Redmine issues: #3392, #3361 and #1017)
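A short illustration of why an explicit character class can be preferable to "\w": in Python 3, "\w" on str patterns also matches non-ASCII word characters. The explicit set used below (ASCII letters, digits and underscore) is only a guess; the set actually chosen in the commit is not stated in this issue.

```python
import re

UNICODE_NAME = re.compile(r'\w+')           # matches any Unicode word character
ASCII_NAME = re.compile(r'[A-Za-z0-9_]+')   # hypothetical explicit character set


def is_valid_name(name, pattern=ASCII_NAME):
    """Return True when the whole node name consists of allowed characters."""
    return pattern.fullmatch(name) is not None


print(is_valid_name('Zdroj'))                      # True
print(is_valid_name('Zdroj\u010d'))                # False: contains a non-ASCII letter
print(is_valid_name('Zdroj\u010d', UNICODE_NAME))  # True: \w accepts it
```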

Revision a8355b5a
Added by Jan Mach almost 2 years ago

Documentation improvements.

Tried to make the section about node delimiters and indices more clear. (Redmine issues: #3392 and #3361)

Revision 6ef84064
Added by Jan Mach almost 2 years ago

Documentation improvements for mentat.filtering.jpath module.

Improved the section about node delimiters and default dict handling behavior. Made sure the documentation displays nicely in pydoc3. Improved formatting for the Sphinx-doc documentation builder. (Redmine issues: #3392 and #3361)

Revision d214e620
Added by Jan Mach almost 2 years ago

Bugfix: JPath library was not working with IDEA messages.

The JPath library did not work with IDEA messages, because they are not built from plain dicts and lists. The isinstance checks had to be updated accordingly. Added assertions for working with IDEA messages to the unit test case to prevent this from happening in the future. (Redmine issues: #3392 and #1017)
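The kind of failure described here can be reproduced with any container that behaves like a dict without inheriting from it. The class below is a hypothetical stand-in, not the real IDEA message implementation, which is not shown in this issue.

```python
from collections.abc import MutableMapping


class TypedRecord(MutableMapping):
    """Hypothetical dict-like container that does not inherit from dict."""

    def __init__(self, **items):
        self._items = dict(items)

    def __getitem__(self, key):
        return self._items[key]

    def __setitem__(self, key, value):
        self._items[key] = value

    def __delitem__(self, key):
        del self._items[key]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


record = TypedRecord(Category='Test')
print(isinstance(record, dict))            # False: a check against dict rejects it
print(isinstance(record, MutableMapping))  # True: a check against the ABC accepts it
```

Checking against the abstract base class instead of the concrete type is one way to accept such objects; whether the commit did exactly this is an assumption.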

Revision bf0598d3
Added by Jan Mach almost 2 years ago

Implemented support for conditional value overwriting and uniqueness.

The jpath_set() method now optionally lets the user specify whether an already existing value should be overwritten and whether the value should be unique. The uniqueness option works only for lists at the end of a JPath. (Redmine issues: #3392, #3361 and #1017, enables #3372 and #3396)
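The semantics described above could look roughly like the following sketch. It operates on a single key instead of a full JPath, and the function name `set_value` and its parameters `append`, `overwrite` and `unique` are illustrative assumptions, not the signature of jpath_set().

```python
def set_value(node, key, value, append=False, overwrite=True, unique=False):
    """Set node[key], or append to the list at node[key].

    Returns True when the data structure was changed, False otherwise.
    ``append=True`` plays the role of the '*' index at the end of a path.
    """
    if append:
        items = node.setdefault(key, [])
        if unique and value in items:
            return False  # value already present, keep the list unique
        items.append(value)
        return True
    if key in node and not overwrite:
        return False      # existing value is preserved
    node[key] = value
    return True


message = {}
print(set_value(message, 'Note', 'first'))                          # True
print(set_value(message, 'Note', 'second', overwrite=False))        # False
print(set_value(message, 'Type', 'tag', append=True, unique=True))  # True
print(set_value(message, 'Type', 'tag', append=True, unique=True))  # False
print(message)  # {'Note': 'first', 'Type': ['tag']}
```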

Revision ae7b7047
Added by Jan Mach almost 2 years ago

Fixed wrong isinstance() comparisons.

The previous implementation had the following flaws: the collections.abc module did not exist in older versions of Python, and comparing against the immutable Sequence resulted in strings being treated as lists, which is not the desired behavior. (Redmine issue: #3392)
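Both flaws are easy to demonstrate. The snippet below shows a fallback import for interpreters without collections.abc and why a check against MutableSequence avoids the string problem. The helper name `is_list_like` is hypothetical, and the commit may have solved this differently.

```python
try:
    from collections.abc import MutableSequence, Sequence
except ImportError:  # older interpreters without the collections.abc module
    from collections import MutableSequence, Sequence


def is_list_like(value):
    """Return True for mutable sequences such as list, but not for str or tuple."""
    return isinstance(value, MutableSequence)


print(isinstance('abc', Sequence))  # True: a str is an (immutable) Sequence
print(is_list_like('abc'))          # False
print(is_list_like([1, 2, 3]))      # True
print(is_list_like((1, 2, 3)))      # False: tuples are immutable as well
```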

Revision ae6231f0
Added by Jan Mach almost 2 years ago

Performance optimization of mentat.filtering.jpath.jpath_parse() function.

Implemented jpath_parse_c() as a caching variant of the jpath_parse() function. Benchmarking confirmed a huge performance improvement, see mentat.filtering.benchmark.bench_jpath.py for details. Another small improvement was achieved by moving compilation of the chunk regular expression to a global module variable (14% performance increase). All module functions now use the caching variant internally. Also implemented cache_size() and cache_clear() functions for cache management. Beware that jpath_parse_c() does not make a deep copy of the value returned from the cache (for performance reasons). Treat those values as read-only, or suffer the consequences. (Redmine issues: #3392 and #1017)
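The caveat about shared cache entries is worth seeing in action. The sketch below uses a trivial stand-in parser and a plain dict as cache; the names `parse` and `parse_cached` are hypothetical, and the bodies of `cache_size()` and `cache_clear()` are guesses at what functions with those names might do, not the library's code.

```python
import re

# Compiled once at import time instead of on every call.
DELIMITER_RE = re.compile(r'\.')

_CACHE = {}


def parse(path):
    """Stand-in parser: one dict per dot-separated chunk."""
    return [{'name': name} for name in DELIMITER_RE.split(path)]


def parse_cached(path):
    """Return the parsed path, reusing an earlier result when available."""
    try:
        return _CACHE[path]
    except KeyError:
        _CACHE[path] = parse(path)
        return _CACHE[path]


def cache_size():
    return len(_CACHE)


def cache_clear():
    _CACHE.clear()


first = parse_cached('Source.IP4')
second = parse_cached('Source.IP4')
print(first is second)   # True: the very same object is returned, not a copy

first[0]['name'] = 'Oops'                      # mutating the result ...
print(parse_cached('Source.IP4')[0]['name'])   # Oops: ... corrupts the cache entry
```

Returning the cached object directly avoids a deep copy on every call, which is where the speed comes from, at the cost of requiring callers to treat the result as read-only.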

Revision 25ab08cc
Added by Jan Mach over 1 year ago

Fix: Fixed the bug in logical expression in JPath library.

The bug caused the JPath library to fail when retrieving values from indexed IDEA message object representations. (Redmine issue: #3392)

History

#1 Updated by Jan Mach almost 2 years ago

  • Description updated (diff)

#2 Updated by Jan Mach almost 2 years ago

  • Status changed from New to Feedback
  • Assignee changed from Jan Mach to Pavel Kácha
  • Priority changed from High to Normal

Because mentat-inspector.py is currently being heavily used and needs to meet expectations, I have made many improvements to the key JPath library. I have polished the whole library and added documentation and unit tests. We should now know exactly what the library is capable of and what its possible drawbacks are, and if needed we can think about further improvements. Reading the documentation at the head of the file should be sufficient preparation for the discussion.

Besides general polishing, I have implemented support for appending values to an array, as was requested by email. However, it is not yet deployed on the production server.

#3 Updated by Pavel Kácha almost 2 years ago

  • Assignee changed from Pavel Kácha to Jan Mach

So, if I understand it correctly, # adds a dict and items into it, and * adds a list and items into it?

#4 Updated by Jan Mach almost 2 years ago

  • Status changed from Feedback to In Progress

At the next face-to-face meeting, consider the following options:

  1. Support for unique sets of values
  2. Start indexing at value '0' instead of '1'
  3. Remove '*' index in favor of empty brackets '[]', or use them as alternative syntax
  4. Change syntax for better support of lists to:
    Node.[1].FirstSubelement.ListAttribute.[1]

#5 Updated by Jan Mach almost 2 years ago

  • Related to Task #3393: Revision of pynspect library added

#6 Updated by Pavel Kácha almost 2 years ago

Resolutions from meeting:

  • overwrite and unique added as part of api, but not part of syntax (#3396, #3372)
  • zero based indexing - yes
  • Node[1].FirstSubelement.ListAttribute[1] is still fine, separator is (".", "["), trailing "]" then distinguishes object and array subscripts.
  • Let’s stay with asterisk (Node[*]), we don’t particularly like it, but it’s explicit and consistent with possible extension to object notation (_CESNET.*).

#7 Updated by Jan Mach almost 2 years ago

  • Description updated (diff)

Documentation must contain examples for common use cases.

#8 Updated by Jan Mach 8 months ago

  • Target version changed from 2.0 to Future
  • Parent task deleted (#3376)
