[Xml basic concept introduction]
xml stands for extensible markup language.
XML is designed to transfer and store data.
<foo>#start tag of foo element </foo>#end tag of foo element #note:Each start tag must have a corresponding end tag to close it. Can also be written as<foo />
<foo>#elements can be nested to arbitrary arguments <bar></bar>#The bar element is a child of the foo element </foo>#end tag of parent element foo
<foo lang="en">#The foo element has a lang attribute whose value is:en;corresponding to a python dictionary (name-value) pair; <bar lang="ch"></bar>#The bar element has a lang attribute with the value:ch;and an id attribute with the value:001, placed in "" or "" ; </foo>#The lang attribute in the bar element will not conflict with the foo element,Each element has its own set of attributes;
<title>learning python</title>#element can have text content #note:If an element has no text content,There are no child elements,Is an empty element.
<info>#info element is the root node <list>a</list>#list elements are child nodes <list>b</list> <list>c</list> </info>
<feed xmlns="http://www.w3.org/2005/atom">#The default namespace can be defined by declaring xmlns,The feed element is in the http://www.w3.org/2005/atom namespace <title>dive into mark</title>#The title element is also. A namespace declaration will not only affect the element that currently declares it,Also affects all child elements of the element </feed> You can also define a namespace through the xmlns:prefix declaration and name it prefix. Each element in the namespace must then be explicitly declared with this prefix. <atom:feed xmlns:atom="http://www.w3.org/2005/atom">#feed belongs to namespace atom <atom:title>dive into mark</atom:title>#The title element also belongs to the namespace </atom:feed>#xmlns (xml name space)
[Several parsing methods of xml]
Common xml programming interfaces are dom and sax. These two interfaces handle xml files differently.Use occasions are naturally different.
Python has three ways to parse XML:sax, dom, and elementtree:
1.sax (simple api for xml)
The pyhton standard library includes a sax parser, and sax uses an event-driven model.Processes XML files by triggering events and calling user-defined callback functions during XML parsing. sax is an event-driven api. Parsing XML documents using sax involves two parts:a parser and an event handler.
The parser is responsible for reading the xml document and sending events to the event handler,Such as element start and end events;The event handler is responsible for handling the event.
Advantages:Sax streaming reads XML files, which is faster and takes up less memory.
Disadvantages:Users need to implement callback functions (handlers).
2.dom (document object model)
Parse the xml data into a tree in memory,Manipulate xml through operations on the tree. A dom parser reads the entire document at once when parsing an xml document.Save all elements of the document in a tree structure in memory,Then you can use different functions provided by dom to read or modify the content and structure of the document.You can also write the modified content to an xml file.
Pros:The advantage of using dom is that you don't need to track the status,Because every node knows who is its parent,Who are the child nodes.
Disadvantages:dom needs to map xml data to a tree in memory,One is relatively slow,The second is more memory consuming,It's more troublesome to use!
Elementtree is like a lightweight dom, with a convenient and friendly API. Good code availability,It is fast and consumes less memory.
in comparison,The third method,Easy and fast, we use it all the time! Here's how to use element tree to parse XML:
elementtree was born to process xml, and it has two implementations in the Python standard library.
One is a pure python implementation, for example:xml.etree.elementtree
The other is faster:xml.etree.celementtree
Try to use the kind implemented by C language,Because it's faster,And it consumes less memory! You can write this in your program:
try: import xml.etree.celementtree as et except importerror: import xml.etree.elementtree as et
#When i want to get the attribute value,Use the attrib method. #When i want to get the node value,Use the text method. #When i want to get the node name,Use the tag method.
<?xml version="1.0" encoding="utf-8"?> <info> <intro>book message</intro> <list> <head>bookone</head> <name>python check</name> <number>001</number> <page>200</page> </list> <list> <head>booktwo</head> <name>python learn</name> <number>002</number> <page>300</page> </list> </info>
Method one:load the file
Method two:load the string
Method 1:Get the specified node->getiterator () method
Method 2:Get the specified node->findall () method
Method 3:Get the specified node->find () method
Method 4:Get the child node->getchildren ()
for node in book_node: book_node_child=node.getchildren ()  print book_node_child.tag, "=>", book_node_child.text
#coding=utf-8 try:#import module import xml.etree.celementtree as et except importerror: import xml.etree.elementtree as et root=et.parse ("book.xml") #Parse the xml file books=root.findall ("/ list") #Find all children of list under the root directory for book_list in books:#Iterate over the search results print "=" * 30 #output format for book in book_list:#Iterate over each child node,Find out your attributes and values inside if book.attrib.has_key ("id"):#Id for conditional judgment print "id:", book.attrib ["id"] #print the attribute value based on id print book.tag + "=>" + book.text #print tags and text content print "=" * 30
================================ head =>bookone name =>python check number =>001 page =>200 ================================ head =>booktwo name =>python learn number =>002 page =>300 ================================
ps:Here are several online tools on xml operation for your reference:
Online xml/json conversion tool:
Format xml online/compress xml online:
xml online compression/formatting tool:
- Introduction to several common methods of parsing XML with Python
- In-depth interpretation of several ways Python parses XML
- Detailed interpretation of the method of parsing XML data in Python
- Parsing XML instance using SAX in Python
- Horizontal comparison analysis of four ways to parse XML in Python
- Python parsing method of dom element in XML
- Application examples of Python parsing XML through DOM and SAX
- python - you may need to restart the kernel to use updated packages error
- php - coincheck api authentication doesn't work
- php - i would like to introduce the coincheck api so that i can make payments with bitcoin on my ec site
- [php] i want to get account information using coincheck api
- the emulator process for avd pixel_2_api_29 was killed occurred when the android studio emulator was started, so i would like to
- i want to call a child component method from a parent in vuejs
- python 3x - typeerror: 'method' object is not subscriptable
- dart - flutter: the instance member'stars' can't be accessed in an initializer error
- xcode - pod install [!] no `podfile 'found in the project directory