Data mismatch #14

Open
jn0 opened this issue Jan 20, 2017 · 5 comments

Comments

@jn0

jn0 commented Jan 20, 2017

Look at this example:

    from xml.etree.ElementTree import fromstring
    from collections import OrderedDict
    import json
    import xmljson

    bf = xmljson.BadgerFish(dict_type=OrderedDict)
    # Note the trailing "z" after </b>: it is missing from the output below.
    q = bf.data(fromstring('<a p="1">x<b r="2">y</b>z</a>'))
    print(json.dumps(q, indent=2))

Output will be:

    {
      "a": {
        "@p": 1, 
        "$": "x", 
        "b": {
          "@r": 2, 
          "$": "y"
        }
      }
    }

Where is the z value?
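
For context, a minimal sketch using only the standard library, showing where ElementTree keeps each text fragment of the example. The "z" is not stored on `<a>` at all: it is the `.tail` of `<b>`. A converter that reads only `.text` would therefore drop it, which is likely what happens here (an assumption about xmljson's internals, not something confirmed in this thread).

```python
from xml.etree.ElementTree import fromstring

root = fromstring('<a p="1">x<b r="2">y</b>z</a>')
child = root[0]  # the <b> element

print(repr(root.text))   # 'x'  -> text before the first child of <a>
print(repr(child.text))  # 'y'  -> text inside <b>
print(repr(child.tail))  # 'z'  -> text after </b>, still inside <a>
print(repr(root.tail))   # None -> nothing follows </a>
```

Any text that follows a child element lands in that child's `.tail`, so a lossless conversion has to visit both `.text` and every child's `.tail`.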

Tested with

  • Python 2.7.12 (default, Nov 19 2016, 06:48:10)
  • IPython 2.4.1 -- An enhanced Interactive Python
  • Ubuntu 16.04.1 LTS (4.4.0-57-generic) x86_64

xmljson was installed via pip.

I'd expect something like

    {
      "a": {
        "@p": 1, 
        "$": "x", 
        "b": {
          "@r": 2, 
          "$": "y"
        },
        "$$": "z"
      }
    }

@jn0
Author

jn0 commented Jan 20, 2017

Plus, I'd like to preserve XML comments too.
Say, under a "!" property name, with repeats serialized the same way as the text fragments ("!", "!!", "!!!", etc.):

    {
      "!": "comment 1",
      "some": { "more": "JSON here" },
      "!!": "comment 2"
    }

@sanand0
Owner

sanand0 commented Jan 20, 2017

@jn0 -- on the comments and text fragments (your "z"), the BadgerFish convention is silent. There is a bi-directional extension that uses $1, $2, etc. for text fragments and !1, !2, etc. for comments -- but this is not backward compatible with BadgerFish.

Also, if we did extend this, I'd like it to also work (to the extent possible) for the other conventions we're implementing -- i.e. GData, Yahoo and Parker.

Any thoughts on how you might structure the JSON attributes for these?
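
As a starting point for that discussion, here is a rough sketch of how numbered keys could look for the BadgerFish-style case. It uses only the standard library and is not xmljson code: the helper names `parse_with_comments` and `to_numbered_dict` are hypothetical, the exact numbering rules of the bi-directional extension are assumed rather than taken from a spec, and attribute values are left as strings instead of being type-converted. Keeping comments in the tree relies on `TreeBuilder(insert_comments=True)`, which needs Python 3.8 or later.

```python
import json
import xml.etree.ElementTree as ET
from collections import OrderedDict


def parse_with_comments(xml_text):
    """Parse XML while keeping comments in the tree (needs Python 3.8+)."""
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(xml_text, parser=parser)


def to_numbered_dict(elem):
    """Hypothetical converter: text goes to $1, $2, ... and comments to !1, !2, ..."""
    out = OrderedDict()
    for name, value in elem.attrib.items():
        out["@" + name] = value  # attribute values stay strings in this sketch
    counters = {"$": 0, "!": 0}

    def add(prefix, text):
        # Skip missing or whitespace-only fragments; number the rest in order.
        if text and text.strip():
            counters[prefix] += 1
            out[prefix + str(counters[prefix])] = text.strip()

    add("$", elem.text)
    for child in elem:
        if child.tag is ET.Comment:
            add("!", child.text)
        else:
            value = to_numbered_dict(child)
            if child.tag in out:
                # Repeated tags are collected into a list.
                if not isinstance(out[child.tag], list):
                    out[child.tag] = [out[child.tag]]
                out[child.tag].append(value)
            else:
                out[child.tag] = value
        add("$", child.tail)  # text that follows the child or comment
    return out


root = parse_with_comments('<a p="1">x<!-- note --><b r="2">y</b>z</a>')
result = {root.tag: to_numbered_dict(root)}
print(json.dumps(result, indent=2))
```

Because the keys are inserted in document order, an ordered dict keeps the relative position of text, comments and child elements. Whether the same key scheme carries over to GData, Yahoo and Parker is left open here.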

@jn0
Author

jn0 commented Jan 20, 2017

@sanand0 not much, actually: I'm a newbie here in XML land :)
But $2 and !2 look no worse than $$ and !! (as well as #2 for CDATA).
It seems obvious to me that losing parts of the source isn't good enough anyway.
Maybe just add a bidirectional=False argument to the BadgerFish constructor and act accordingly?

In the non-bidirectional mode, the only thing needed is to gather the parts into a tuple in the traditional dict, I think.
This loses the exact positions but still preserves the values...
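
The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not xmljson's API: the function `collect_text` and its `bidirectional` parameter are hypothetical, and keeping a plain string when there is a single fragment is an assumption made here so that existing BadgerFish output would stay unchanged.

```python
from xml.etree.ElementTree import fromstring


def collect_text(elem, bidirectional=False):
    """Return the text entries for one element (hypothetical helper)."""
    # Text before the first child, then the text following each child.
    fragments = [elem.text] + [child.tail for child in elem]
    fragments = [f.strip() for f in fragments if f and f.strip()]
    if bidirectional:
        # Numbered keys, one per fragment: $1, $2, ...
        return {"$%d" % i: f for i, f in enumerate(fragments, start=1)}
    if not fragments:
        return {}
    if len(fragments) == 1:
        return {"$": fragments[0]}  # same shape as plain BadgerFish
    return {"$": tuple(fragments)}  # values kept, exact positions lost


a = fromstring('<a p="1">x<b r="2">y</b>z</a>')
print(collect_text(a))                      # {'$': ('x', 'z')}
print(collect_text(a, bidirectional=True))  # {'$1': 'x', '$2': 'z'}
print(collect_text(a[0]))                   # {'$': 'y'}
```

Note that `json.dumps` writes a tuple as a JSON array, so in the non-bidirectional mode "$" would be either a string or an array depending on the input.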

@AlexandraBomane

Hi @sanand0!

I have the same problem as @jn0:
some data is missing from my JSON output. I think there is a problem with recursion in your parser.
Could you have a look at that, please?

Best,
Alexandra

@dagwieers
Contributor

This problem also affects the Abdera and Cobra conventions I implemented.
The problem itself is noted as a TODO (with a commented-out test) in the tests.
