XML

XML through VC++

(Please note that this article focuses mainly on using the MSXML APIs with VC++.)

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards[1].


XML is nowadays "the" standard for structured data exchange. "Standard" means that everybody (all the technologies, platforms etc.) knows it. "Structured" means that you know what the sender means when it says <year>2013</year>. Wow! I learned it just a while ago and have used it with COBOL (Micro Focus) and with VC++ (2008). The approaches I took to create and parse XML files are given below.


All the examples below use this sample document (OK, not a very good one, but just consider it):
<?xml version="1.0" encoding="UTF-8" ?>
<countries>
<country continent="Asia" rank="1"> India
   <GDP> 8.00 </GDP>
</country>
<country continent="Asia" rank="12"> Some other country...
   <GDP> 9.00 </GDP>
</country>
</countries>



Approach-1, the old, legacy way!

Print (and parse too...) each and every tag and element by hand. In other words, if you want to pass, say, a country value, do it like this:
sprintf_s( szXML, "%s %s %s", szCountryTag, szCountryValue, szCountryTagEnd);
or in COBOL:

STRING "<country>" WS-COUNTRY-NAME "</country>"
   DELIMITED BY SIZE
   INTO WS-XML


Well.....it is a pain, ain't it?
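
One reason the manual approach hurts is that every value has to be escaped by hand: a country name containing & or < would otherwise produce a document that no parser accepts. A minimal sketch in standard C++ (the helper names escapeXml and makeElement are made up for this illustration, they are not part of any library):

```cpp
#include <string>

// Hypothetical helper: replaces the five characters that XML reserves.
std::string escapeXml(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        switch (text[i]) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += text[i];  break;
        }
    }
    return out;
}

// Hypothetical helper: builds <tag>value</tag> by plain concatenation.
// The tag name is trusted as-is; only the value is escaped.
std::string makeElement(const std::string& tag, const std::string& value) {
    return "<" + tag + ">" + escapeXml(value) + "</" + tag + ">";
}
```

With these, makeElement("country", "Trinidad & Tobago") gives <country>Trinidad &amp; Tobago</country>. Attributes, nesting and parsing would all need the same kind of hand-written care, which is where a parser starts to pay off.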

The second, more efficient approach is to use the parsers that come with the compiler or platform. The rest of this article talks about that approach, focusing on VC++.

Approach-2. Use a parser -

 Options for VC++ (...the ones which I'm aware of):
  1. SAX2 (Simple API for XML).
  2. MSXML 6.0, with DOM (Document Object Model) APIs - this one will be discussed in detail

For COBOL, Micro Focus [2] provides enough documentation on "cbl2xml", but I think this depends on the version of your compiler too, so please refer to the documentation provided with your compiler for more details. IBM also says that XML parsing is now available for COBOL [4]; again, please refer to the documentation.

Using MSXML 6.0, with DOM (Document Object Model) APIs in VC++

What needs to be included / imported:

#import "msxml6.dll"


This generates C++ wrappers from the type library contained in msxml6.dll [3], and you can simply start using all the typedefs etc. defined in this library.

 Part-1: Parsing an XML document (i.e. reading an XML file and then getting tags, elements and attributes):


How to start using it...simply use the "MSXML2" definitions as shown below:

1. Declaring variables
(all the variables in the examples given below are <hold your breath> pointers </hold..>)



try { //*Yes.....use this for error handling
   MSXML2::IXMLDOMDocumentPtr docPtr;
   MSXML2::IXMLDOMNodeListPtr NodeListPtr;
   MSXML2::IXMLDOMElementPtr Element;
   MSXML2::IXMLDOMParseErrorPtr error;
   HRESULT hr=S_FALSE;

2. Initialize the XML part
   CoInitialize(NULL);
   hr = docPtr.CreateInstance("Msxml2.DOMDocument.6.0"); //Create a DOM object
   if (FAILED(hr))
      throw("Create XML instance failed");
   docPtr->async = VARIANT_FALSE; //Load synchronously: load() returns only when parsing is done
/*Load & open the XML document specified in input file*/
   _variant_t varXml("WhateverMyFileNameIs");
   if (docPtr->load(varXml) != VARIANT_TRUE) {
      error = docPtr->parseError; //Holds the reason and the line number of the failure
      throw("Exception in loading the XML document");
   }
/*getElementsByTagName returns a node list (not an HRESULT), so check the list itself*/
   NodeListPtr = docPtr->getElementsByTagName("countries");
   if (NodeListPtr == NULL || NodeListPtr->length == 0)
      throw("Root node countries is NOT present!");
   if (NodeListPtr->length > 1)
      throw("Unlikely, but 1+ countries root-nodes!");

3. Getting the value of a node: e.g. "country" node, as in <country> India </country>
   NodeListPtr = docPtr->getElementsByTagName("country");
   if (NodeListPtr == NULL || NodeListPtr->length == 0) {
      throw("country node is NOT present!");
   } else {
      //text also includes the text of child elements (here the GDP value)
      StringCountry=NodeListPtr->item[0]->text;
   }
4. Going through _all_ the nodes and getting their values: go through a loop and save the tags and text values to a vector:
   NodeListPtr = docPtr->getElementsByTagName("*");
   iNoOfTags=0;
   for (iCounter = 0 ; iCounter < NodeListPtr->length; iCounter++){
       Element = NodeListPtr->item[iCounter];
       STag=Element->nodeName;
       SValue=Element->text;
       SomeVector.push_back(TagNValueJustRead); /*TagNValueJustRead is built from STag and SValue*/
       iNoOfTags++;
   } /*End For */
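
The loop in step 4 leaves the types of the vector and of its entries open. One possible shape, sketched in standard C++ so that it does not depend on MSXML (the names TagAndValue, TagList and findFirstValue are assumptions made for this illustration; with MSXML the two strings would be filled from Element->nodeName and Element->text):

```cpp
#include <string>
#include <vector>

// Hypothetical record for one element: its tag name and its text.
struct TagAndValue {
    std::string tag;
    std::string value;
};

typedef std::vector<TagAndValue> TagList;

// Returns the value of the first entry with the given tag,
// or an empty string when no such tag was stored.
std::string findFirstValue(const TagList& tags, const std::string& tag) {
    for (TagList::const_iterator it = tags.begin(); it != tags.end(); ++it) {
        if (it->tag == tag)
            return it->value;
    }
    return std::string();
}
```

A flat list like this keeps the document order but loses the nesting, so it suits small, shallow documents such as the countries sample; anything deeper is better walked through the DOM directly.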

5. Getting the attributes of a node, e.g. <country continent="Asia" rank="1"> India </country>
...Yes, I do know that using all these attributes, when we've got all the flexibility of "nodes", _is_ sacrilegious, but since they are there, we need to know how to get 'em out!

    MSXML2::IXMLDOMNodePtr attrNode = NULL;
    NodeListPtr = docPtr->getElementsByTagName("country");
    if (NodeListPtr == NULL || NodeListPtr->length == 0)
       throw("Well, country is not there!");

    /*The attributes live on the country elements themselves, so loop over those (not over their children)*/
    int iNoOfCountries=NodeListPtr->length;
    for (iCounter = 0 ; iCounter < iNoOfCountries; iCounter++){
         Element = NodeListPtr->item[iCounter];
         SCountryName=Element->text;
         attrNode = Element->attributes->getNamedItem("continent");
         if (attrNode != NULL)
            SContinent=attrNode->text; /*Text such as "Asia", so no atoi here*/

         attrNode = Element->attributes->getNamedItem("rank");
         if (attrNode != NULL)
            iRank=atoi((const char*)attrNode->text);
  .....store / process this data.
    }/*End for*/
} catch(...) {
   ...Exception handling
}
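
A word on atoi: it returns 0 both for the text "0" and for text that is not a number at all, so a damaged rank attribute goes unnoticed. A slightly stricter conversion, sketched in standard C++ (parseRank is a made-up name for this illustration; with MSXML the input would come from (const char*)attrNode->text):

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: converts attribute text such as "12" to an int.
// Returns false when the text is empty, is not a number, has trailing
// characters after the number, or does not fit into an int.
bool parseRank(const char* text, int& rank) {
    if (text == 0 || *text == '\0')
        return false;
    char* end = 0;
    errno = 0;
    long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return false;
    if (value < INT_MIN || value > INT_MAX)
        return false;
    rank = static_cast<int>(value);
    return true;
}
```

On failure the output parameter is left untouched, so the caller can decide whether to skip the country or to stop with an error.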

Part-2: Creating an XML document:



 1. As usual...try, declare, initialize etc...

try {
   MSXML2::IXMLDOMDocumentPtr docPtr;
   MSXML2::IXMLDOMElementPtr rootElement, Element;
   HRESULT hr=S_FALSE;

   hr=CoInitialize(NULL); 
   hr=docPtr.CreateInstance("Msxml2.DOMDocument.6.0"); //Create a DOM object
   if(FAILED(hr)) throw("Create XML Instance failed");

   _variant_t varOut((bool)TRUE);

2. Create a root node "countries"
To create a document, you need to supply the XML declaration (the processing instruction on the first line) as well. The example below does it manually (hard-coded). The other option was to use docPtr->createProcessingInstruction(), but in my case it got appended at the very end of the XML document. Since the process at the receiving end expected it at the very top (first line), I had to hard-code it like this:

/*Create root-node - "countries"*/
varOut=docPtr->loadXML("<?xml version=\"1.0\" encoding=\"UTF-8\" ?><countries></countries>");
if ((bool)varOut == FALSE)
   throw("Error loading the document");

3. Now keep on adding all the child-nodes to it.
rootElement = docPtr->GetdocumentElement();

//1. Create a node "country"
Element = docPtr->createElement(_T("country"));
Element->Puttext("India");
Element=rootElement->appendChild(Element);

//2. Add another country to the list...
Element = docPtr->createElement(_T("country"));
Element->Puttext(country2.c_str());
Element=rootElement->appendChild(Element);

...and so on.

4. Save (create a physical file) after all the nodes have been created.

   docPtr->save("MyNewXMLFile");
} catch(...) {
   ...Exception handling
}

5. Last...but, NOT the least!
CoUninitialize();
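
A note on the error messages thrown in these examples: adding __LINE__ to a string literal with + does not append the number, it moves the pointer that many characters into (or past) the literal. If the line number is wanted in the message, it has to be formatted into a string first. A minimal sketch in standard C++ (makeError and the XML_ERROR macro are made-up names for this illustration):

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds "message (line N)" as a std::runtime_error.
std::runtime_error makeError(const std::string& message, int line) {
    std::ostringstream text;
    text << message << " (line " << line << ")";
    return std::runtime_error(text.str());
}

// Hypothetical convenience macro, so that __LINE__ is taken at the throw site.
#define XML_ERROR(message) makeError((message), __LINE__)
```

It would be used as throw XML_ERROR("Error loading the document"); and caught with catch (const std::exception& e), where e.what() returns the full message.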

References:

[1] http://en.wikipedia.org/wiki/XML
[2] http://supportline.microfocus.com/documentation/books/nx50/dxxmlo.htm
[3] http://msdn.microsoft.com/en-us/library/windows/desktop/ms757018(v=vs.85).aspx
[4] http://publib.boulder.ibm.com/infocenter/iadthelp/v7r0/index.jsp?topic=/com.ibm.etools.iseries.pgmgd.doc/c0925405239.htm
