I ran into an issue when I tried to get a node from an RDF file. Normally, I would do the following to get a node:
xmllint --xpath '//foo/bar' test.xml
But this does not work on RDF because namespaces are involved: an unprefixed name in an XPath expression never matches an element that lives in a namespace. So, you need to tell xmllint about the namespaces:
$ xmllint --shell test.xml
/ > setrootns
/ > cd rdf:RDF
RDF > dir
ELEMENT rdf:RDF
namespace rdf href=http://www.w3.org/1999/02/22-rdf-syntax-...
default namespace href=http://purl.org/rss/1.0/
namespace taxo href=http://purl.org/rss/1.0/modules/taxonomy...
namespace syn href=http://purl.org/rss/1.0/modules/syndicat...
RDF > setns a=http://purl.org/rss/1.0/
RDF > cd a:item[1]
item > dir
ELEMENT item
ATTRIBUTE about
TEXT
content=http://example.com
item >
The setrootns command is there for selecting the <rdf:RDF/> node: it lets xmllint create the rdf namespace prefix for you. It is not necessary if you are not interested in that node. You will need to assign a prefix to the default namespace with setns in order to select the node you want. The name of the prefix isn’t important; any name will do.
Unfortunately, I don’t see any way to set the namespace prefix without --shell. So, you will need to run xmllint like this:
echo -e 'setns a=http://purl.org/rss/1.0/\ncat //a:item[1]/a:description/text()' | xmllint --shell test.xml
Then you parse the output, removing the unwanted lines that the shell adds, such as its prompts.
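A minimal sketch of that cleanup step. It assumes the shell echoes its prompt as "/ > " and prints " -------" separator lines with cat results; the exact output may differ between libxml2 versions, so check it against yours. The function name strip_xmllint_shell is made up for this example.

```shell
# Hypothetical helper: remove xmllint --shell noise from stdin.
# Assumed noise: leading "/ > " prompts and " -------" separator lines.
# Note that blank lines inside the real content are dropped as well.
strip_xmllint_shell() {
    sed -e 's|^\(/ > \)*||' \
        -e '/^ -------$/d' \
        -e '/^$/d'
}

# Usage (same query as above):
# printf '%s\n' 'setns a=http://purl.org/rss/1.0/' \
#     'cat //a:item[1]/a:description/text()' \
#     | xmllint --shell test.xml | strip_xmllint_shell
```

The filter only strips the root prompt; if the script runs cd first, the prompt changes to the node name and the pattern has to be adjusted.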
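If you don’t want to go through all of the above, you can sidestep the namespaces either by selecting with a positional wildcard such as *[1], which matches an element regardless of its namespace, or by stripping the namespace parts from the source before querying.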
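A related trick, not from the steps above, is the standard XPath function local-name(), which compares an element's name without its namespace. A minimal sketch, assuming an xmllint build that supports --xpath; the function name first_item_description is invented here:

```shell
# Hypothetical helper: print the description text of the first item.
# Elements are matched by local name, so no prefix has to be registered.
first_item_description() {
    xmllint --xpath \
        "//*[local-name()='item'][1]/*[local-name()='description']/text()" \
        "$1"
}

# Usage:
# first_item_description test.xml
```

Because the match ignores namespaces entirely, it would also pick up an item element from an unrelated vocabulary, which is the price of skipping setns.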
Note
You may want to try xmlstarlet, it’s easier to query with namespaces. (2012-06-29T17:03:45Z)
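For comparison, a sketch of the same query with xmlstarlet, whose sel command takes namespace bindings through -N. The prefix a and the function name rss_first_description are arbitrary choices for this example, and it assumes the binary is installed as xmlstarlet (some systems may name it differently):

```shell
# Hypothetical helper: print the description of the first item.
# -N binds the prefix "a" to the RSS 1.0 default namespace,
# -t -v prints the value of the XPath match, -n adds a newline.
rss_first_description() {
    xmlstarlet sel -N a=http://purl.org/rss/1.0/ \
        -t -v '//a:item[1]/a:description' -n "$1"
}

# Usage:
# rss_first_description test.xml
```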