Applications store data in different formats. To exchange data between applications, you must convert the source data into a format the destination application can understand, so that both sides are comparing apples to apples. It’s therefore important to use a data format that’s compatible with different applications.
Enter Extensible Markup Language, or XML. XML data is stored in plain text. It’s both human- and machine-readable and is hardware-independent, which makes XML data very portable. Using an XML API, you can share XML data across different applications, browsers, or operating systems.
The XML API receives data from a database, then converts it into XML format to be sent to another application that accepts XML inputs. This effectively allows you to treat your database as if it were structured in XML.
This article explores how you can leverage an XML API to convert information from a database into an XML format to simplify information exchange with other applications.
When should you use XML APIs?
Let’s explore some scenarios where you’ll want to use XML APIs instead of the “raw” information from the database, and why XML APIs are beneficial in these cases.
1. Sharing Data Between Systems
XML APIs are a good option when you need to share data between systems. XML data can be consumed by almost any application, because nearly every programming language has an XML parser.
By using XML to exchange data, you can also use tools like Extensible Stylesheet Language Transformations (XSLT) and XML Schema Definition (XSD) to further process data. These are companion standards to XML, and many XML libraries support them out of the box, so you often don’t need additional third-party software.
2. Standardizing the Data Schema
If you use an XML API when creating an application that receives data from external sources, you can construct a common schema that defines how the data should be structured. This way, those sending the data only need to validate their data against the defined schema before sending it. This allows you to easily consume data within the application, as you always know how it’s structured.
3. Data Storage
You may choose to use an XML API over a database if the application data will be viewed in different formats or on different devices. For instance, content from PDF files, Word documents, HTML files, CSV files, and RTF files can be represented as XML. An application can then parse that XML and render the content in the format it needs.
4. Converting Database Tables to HTML
With an XML API, you can convert database tables to HTML pages easily. You can use an XSLT stylesheet to generate the HTML pages from a table represented as an XML file. With XSLT, you can also generate other file formats, like text files.
Parsing XML Documents With DOM and SAX
To parse XML documents, you can use the Document Object Model (DOM) or the Simple API for XML (SAX). DOM represents the XML document in a tree-like structure. Each element in the DOM tree is modeled as a node object that can be accessed, modified, or deleted.
DOM differs from SAX in that it holds the whole tree structure in memory, while SAX doesn’t. Instead, SAX relies on events to traverse the elements in the document. For example, SAX emits an event when it encounters an opening tag, a closing tag, and the characters in between. Because of this, SAX can read the XML document in small chunks and therefore consumes less memory, which usually makes it the better option for parsing large XML files.
Now, let’s take a high-level look at how to apply DOM and SAX to data from a database, starting with the Simple API for XML.
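To make the contrast concrete, here is a minimal sketch in Python. The choice of Python and its standard xml.dom.minidom and xml.sax modules is an assumption of this example, as are the sample document and the function names. It counts row elements twice: once by building the full DOM tree and querying it, and once by reacting to SAX start-tag events without keeping a tree.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
SAMPLE = b"<rows><row id='1'/><row id='2'/><row id='3'/></rows>"


def count_rows_dom(xml_bytes):
    """DOM: build the whole tree in memory, then query it."""
    document = minidom.parseString(xml_bytes)
    return len(document.getElementsByTagName("row"))


class RowCounter(xml.sax.ContentHandler):
    """SAX: react to each start-tag event; no tree is kept."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "row":
            self.count += 1


def count_rows_sax(xml_bytes):
    handler = RowCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.count
```

Both functions return the same count for the same input. The difference is in what stays in memory: the DOM version holds every node until the document object is released, while the SAX version holds only a counter.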
How to Use the Simple API for XML (SAX)
With XML APIs, you want to parse the rows and columns of a table in the database. To do so, begin by creating a method that connects to the database and returns the table.
Step 1: Create a method that connects to the database and returns the table.
This method should receive the database connection details like the database host, username, port, password, database name, and the name of the table you want to parse. It should return a result set containing the table data.
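As a rough illustration of such a method, the sketch below uses Python with SQLite. SQLite is an assumption made to keep the example self-contained: it needs no host, port, username, or password, so the connection is created separately and passed in. The customers table, its columns, and the function names are hypothetical.

```python
import sqlite3


def build_demo_database():
    """Create a throwaway in-memory database with one hypothetical table."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    connection.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
    )
    return connection


def fetch_table(connection, table_name):
    """Return (column_names, rows) for one table."""
    # Table names cannot be bound as SQL parameters, so restrict them to
    # plain identifiers before interpolating them into the query.
    if not table_name.isidentifier():
        raise ValueError(f"unsafe table name: {table_name!r}")
    cursor = connection.execute(f"SELECT * FROM {table_name}")
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()
```

With a client-server database, the same shape would apply, with the connection built from the host, port, credentials, and database name instead.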
There are several things you need to keep in mind when converting the result set to an XML document:
- Every element in the document must have matching start and end tags (or be written as a self-closing empty element), and all attribute values must be in quotation marks.
- Avoid leaving blank elements in the XML file where the consumer expects a value, as they can be interpreted inconsistently. You can do this by providing default values for the database columns.
- You also need to handle invalid characters. The XML specification defines which characters are allowed and which are restricted. If your data contains restricted characters, you need to escape them to prevent the parser from throwing an error. These characters include <, &, >, ', and ", which are escaped as &lt;, &amp;, &gt;, &apos;, and &quot;.
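The points above can be sketched as a small Python function that turns rows into an XML string. The element layout (a root named after the table, one row element per record), the number attribute, and the function name are assumptions of this example. It also assumes the table and column names are already valid XML element names.

```python
from xml.sax.saxutils import escape, quoteattr


def rows_to_xml(table_name, columns, rows, default=""):
    """Serialize rows as <table><row><column>value</column>...</row></table>."""
    lines = [f"<{table_name}>"]
    for index, row in enumerate(rows, start=1):
        # quoteattr() escapes the value and wraps it in quotation marks.
        lines.append(f"  <row number={quoteattr(str(index))}>")
        for column, value in zip(columns, row):
            # Replace NULLs with a default so no element is left blank.
            text = default if value is None else str(value)
            # escape() handles &, <, and >; the extra map covers quotes.
            safe = escape(text, {'"': "&quot;", "'": "&apos;"})
            lines.append(f"    <{column}>{safe}</{column}>")
        lines.append("  </row>")
    lines.append(f"</{table_name}>")
    return "\n".join(lines)
```

Building the string by hand keeps the escaping visible. A tree-building library would normally do the escaping for you when it serializes the document.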
Step 2: Validate the XML document.
After converting the result set to a well-formed XML document, it needs to be validated. This ensures the XML file structure follows the rules and constraints defined in a schema.
One of the common schema validation languages is XML Schema Definition (XSD). An XSD schema provides the rules that an XML document must conform to in order to be valid. These rules can include the allowed names for the elements, the data types of the attributes, or the overall structure to which the XML file must adhere. The XML document is validated against the XSD schema, and if it isn’t valid, the XML API throws an exception.
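Python’s standard library does not include an XSD validator, so the sketch below is a hand-written stand-in and not real XSD validation. It only illustrates the kind of rules a schema expresses (allowed element names, their order, and a data type) and how a violation surfaces as an exception. The rule table, the ValidationError class, and the document layout are all hypothetical; in practice you would hand the document and the XSD file to a schema-aware library.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules standing in for an XSD: child element name -> type check.
ROW_RULES = {"id": int, "name": str}


class ValidationError(Exception):
    """Raised when a document breaks the hand-written rules."""


def validate_rows(xml_text, rules=ROW_RULES):
    # fromstring() raises ET.ParseError if the text is not well formed.
    root = ET.fromstring(xml_text)
    expected = list(rules)
    for position, row in enumerate(root.findall("row"), start=1):
        names = [child.tag for child in row]
        if names != expected:
            raise ValidationError(
                f"row {position}: expected {expected}, found {names}"
            )
        for child in row:
            converter = rules[child.tag]
            try:
                converter(child.text or "")
            except ValueError:
                raise ValidationError(
                    f"row {position}: <{child.tag}> is not a valid "
                    f"{converter.__name__}"
                ) from None
    return True
```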
With the result set converted to a valid XML file, you can use the SAX parser.
Step 3: Use the SAX parser to parse the XML file.
The next step is to create the XML parser object to parse the XML file. This object allows you to work with the XML document. It also contains functions that validate XML documents. For a SAX parser, the parser object takes the file and reads it sequentially, emitting an event for each node it encounters. The SAX parser can be of two types:
- A non-validating parser: the parser does not check the validity of the XML input file.
- A validating parser: the application supplies a schema, and the parser validates the XML input file against it.
Validating the file ensures the XML file being parsed has the correct tags, uses valid names, and doesn’t violate the constraints defined in the XSD. When the parser finds validity errors, it reports them as parse exceptions through the ErrorHandler interface provided by SAX. To use this interface in the application, implement it and register an instance with the XML reader using the setErrorHandler method. All the errors raised by the parser are then reported to your handler.
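Registering an error handler can be sketched as follows in Python, whose xml.sax module mirrors the SAX interface names. One caveat: the parser bundled with Python is non-validating, so this example can only trigger well-formedness errors, which arrive through fatalError(). In a validating parser, validity errors would arrive through error(). The class and function names are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Record every problem the parser reports."""

    def __init__(self):
        self.reports = []

    def warning(self, exception):
        self.reports.append(("warning", exception.getMessage()))

    def error(self, exception):
        # Recoverable errors, such as validity errors in a validating parser.
        self.reports.append(("error", exception.getMessage()))

    def fatalError(self, exception):
        # Well-formedness errors: record the problem, then stop parsing.
        self.reports.append(("fatal", exception.getMessage()))
        raise exception


def parse_with_error_handler(xml_bytes):
    error_handler = CollectingErrorHandler()
    reader = xml.sax.make_parser()
    reader.setContentHandler(ContentHandler())
    reader.setErrorHandler(error_handler)
    try:
        reader.parse(io.BytesIO(xml_bytes))
    except xml.sax.SAXParseException:
        pass  # already recorded by fatalError()
    return error_handler.reports
```

Calling parse_with_error_handler() on a well-formed document returns an empty list, while a document with a mismatched tag returns at least one entry tagged "fatal".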
The SAX parser also provides methods that configure how an input XML document is parsed. These include one that sets the content handler, one that sets the error handler, and one that sets the validation mode.
The content handler is responsible for handling the events emitted by the SAX parser.
These events are the startDocument() and endDocument() events, which show where the file begins and ends, and the startElement() and endElement() events, which mark the beginning and end of each element. There are also the characters() events, which are emitted when the parser encounters the text in between the XML elements.
To start the parsing process, pass the XML input file to the parser.
The parser then invokes the callback methods defined in the application’s ContentHandler implementation, beginning with startDocument() at the start of the file.
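The event flow can be observed with a small content handler that simply logs what it receives. This is a minimal sketch using Python’s xml.sax module; the EventLogger class, the log_events() helper, and the tuple format of the log are assumptions of this example.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Append one tuple to self.events for every callback received."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def endDocument(self):
        self.events.append(("endDocument",))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def characters(self, content):
        # Text may arrive in several chunks; skip whitespace-only ones.
        if content.strip():
            self.events.append(("characters", content))


def log_events(xml_bytes):
    handler = EventLogger()
    # Passing the input to the parser is what triggers the callbacks.
    xml.sax.parseString(xml_bytes, handler)
    return handler.events
```

For a document such as <rows><row id="1"><name>Ada</name></row></rows>, the log starts with the startDocument event, ends with the endDocument event, and lists the element and character events in document order in between.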
Now, let’s discuss how you can use DOM to parse XML.
How to Use Document Object Model (DOM)
As previously mentioned, DOM loads all the data at once and presents it as a tree structure. The DOM parser object works slightly differently from the SAX parser object. Rather than emitting events node by node, it parses the whole file into a collection of nodes that make up a DOM tree.
The validation process is quite similar, except that the DOM parser object is used. The XML file is validated against an XSD file to check for structural errors as it’s loaded into the DOM. For the application to report validity errors raised during validation, the schema factory must be configured, the error handler set, and the XSD schema included in the program.
To parse the file, the DOM parser creates the nodes and node list from the XML file. The parser then gets the elements and attributes from each node. Here, a simple loop can be used to iterate over the nodes.
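The loop over nodes might look like the following Python sketch, which uses the standard xml.dom.minidom module. The document layout (row elements with an id attribute and flat child elements) and the function name are assumptions of this example, and no schema validation is performed here.

```python
from xml.dom import minidom


def rows_from_dom(xml_text):
    """Parse the whole document into a DOM tree and return rows as dicts."""
    document = minidom.parseString(xml_text)
    rows = []
    # getElementsByTagName() returns a node list that can be iterated.
    for row in document.getElementsByTagName("row"):
        record = {"id": row.getAttribute("id")}
        for child in row.childNodes:
            # Skip whitespace text nodes that sit between elements.
            if child.nodeType != child.ELEMENT_NODE:
                continue
            text = "".join(
                node.data
                for node in child.childNodes
                if node.nodeType == node.TEXT_NODE
            )
            record[child.tagName] = text
        rows.append(record)
    return rows
```

Because the whole tree is in memory, the rows can be visited in any order or more than once, which is the random access that SAX does not offer.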
Using XML APIs to Manage Business Data
Exchanging data between applications running on different systems can be challenging. Using data formatted in XML reduces this challenge. Because XML is a software- and hardware-independent format, it gives you an easy way to store data and move it from a database in one application to another application without worrying about compatibility.
This article described how you can use SAX and DOM to parse data retrieved from a database. While both can parse XML, they’re quite different: SAX is event-driven and parses content in parts, while DOM parses all the content of a file at once. So, when choosing a parsing method, be sure to consider how you need the data parsed. When you need to be conscious of the memory used, use SAX. However, if you need to access the parsed content at random, opt for DOM.